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Abstract 


The  inverse  gaussian  distribution  can  be  used  to  model  processes  with  skewed  output. 
In  this  thesis,  several  Shewhart  and  CUSUM  control  schemes  are  developed  for  the 
inverse  gaussian  distribution.  The  behavior  of  these  schemes  is  described.  A  new 
type  of  Shewhart  chart,  a  self-starting  Shewhart  chart,  is  developed  and  applied  to 
the  inverse  gaussian  distribution.  A  second  new  type  of  control  chart,  which  controls 
simultaneously  for  multiple  parameters,  is  developed  and  shown  to  have  some  useful 
properties.  Predictive  Shewhart  schemes,  based  on  a  diffuse  prior,  are  developed  and 
used  for  control. 

We  apply  these  methods  to  two  applications.  The  first  examines  the  control 
of  the  distribution  of  the  output  of  a  complex  software  package  representing  military 
combat,  subject  to  continual  revision.  The  second  investigates  the  time  to  complete 
a  task  on  a  General  Motors  assembly  hne. 


Acknowledgements 


I  would  like  to  thank  the  faculty  of  the  School  of  Statistics  of  the  University  of 
Minnesota.  They  have  developed  a  true  community  of  scholars,  a  community  which 
encourages  and  supports  its  graduate  students.  I  have  learned  a  great  deal  about  the 
art  of  teaching  from  them.  Their  enthusiasm  and  style  were  contagious. 

I  especially  thank  Professor  Doug  Hawkins,  my  thesis  advisor,  who  has  guided 
my  studies  with  kindness,  wisdom,  wit,  and  patience  since  1992.  His  encylopedic 
knowledge  of  applied  statistics  and  his  own  extensive  work  in  statistical  process  con¬ 
trol  were  invaluable,  and  shared  graciously.  He  suggested  many  revisions  and  addi¬ 
tions  to  the  work,  which  clarified  and  extended  it.  (Remaining  errors,  if  any,  are  my 
responsibility.)  He  was  sensitive  to  my  military  requirements,  and  made  my  work  a 
priority  when  I  needed  his  quick  assistance.  It  is  not  hyperbole  to  say  that  I  could 
not  have  accomplished  this  work  with  another  advisor. 

I  thank  my  thesis  reviewers.  Professors  Gary  Oehlert  and  John  Anderson,  for 
their  careful  reading  of  the  work,  which  resulted  in  many  improvements. 

I  thank  Professor  Luke  Tierney  for  freely  sharing  his  excellent  Xlisp-Stat  soft¬ 
ware,  used  extensively  in  this  thesis. 

I  thank  my  friends  among  the  graduate  students  of  the  School  for  their  help 
over  the  years  as  we  studied  together  -  particularly  Marilyn  Agin,  Efstathia  Bura, 
Francesca  Chiaromonte,  Wen-Lin  Lai,  Bret  Musser,  and  Dave  Nelson. 

The  staff  of  the  School  of  Statistics  has  been  genuinely  helpful,  and  I  thank 
them  for  their  many  kindnesses. 

I  have  enjoyed  my  contacts  with  the  other  units  of  the  University  during  my 
graduate  schooling,  particularly  the  School  of  Mathematics,  where  I  earned  a  Master’s 


IV 


degree  in  1989;  the  Department  of  Mechanical  Engineering,  where  I  earned  a  minor  in 
1989;  the  Carlson  School  of  Business,  where  I  studied  quality  management  methods 
and  theory  in  1993  and  1994;  and  the  Army  High  Performance  Computing  Research 
Center,  which  provided  me  a  “home  away  from  home”,  computing  support,  and 
administrative  assistance. 

I  am  grateful  to  Mr.  Tom  Herbert  of  the  Rand  Corporation  for  providing 
gratis  one  of  the  data  sets  analyzed  in  Chapter  7,  and  to  Ms.  Angela  Stich,  also  of 
Rand,  who  compiled  the  files  and  assisted  in  the  post-processing.  Mr.  David  Durda, 
of  the  US  Army  Training  and  Doctrine  Analysis  Center,  White  Sands  Missile  Range, 
provided  a  second  data  set  with  post-processing. 

Colonel  Chris  Arney  and  the  faculty  of  the  Department  of  Mathematical  Sci¬ 
ences  at  West  Point  have  been  extremely  supportive,  allowing  me  a  reduced  teaching 
load  as  I  finished  this  thesis. 

My  wife,  Karen,  and  my  children  have  been  supportive  these  four  years  as  I 
studied  and  was  frequently  absent.  Karen  has  tolerated  wdth  humor  and  patience 
my  preoccupied  state  while  wrestling  with  the  problems  of  this  thesis.  Karen  also 
proofread  four  drafts  of  this  thesis,  with  painstaking  attention  to  detail.  I  thank  her 
for  ail  her  assistance. 

Finally,  I  will  forever  be  grateful  to  Brigadier  General  (retired)  Frank  R.  Gior¬ 
dano,  my  former  teacher  as  a  cadet  and  my  boss  during  my  first  three  year  tour  as  an 
instructor  at  West  Point.  He  gave  me  freedom  to  experiment.  General  Giordano  ac¬ 
cepted  my  mistakes  and  turned  them  into  learning  opportunities.  He  selected  me  for 
this  advanced  schooling,  and  has  “set  the  example”  of  soldier-scholar  for  me.  His  love 
of  West  Point,  love  of  mathematics,  and  love  of  cadets  have  been  deeply  inspirational 
to  me,  and  his  faith  in  me  has  changed  my  life. 


V 


Contents 


Abstract 


Acknowledgement  s 


1  Problem  statement 

1.1  Control  Charts . 

1.1.1  Shewhart  charts . 

1.1.2  Cumulative  Sum  (CUSUM)  control  charts . 

1.2  Inverse  Gaussian  processes . 

1.2.1  Well  Known  Properties . 

1.2.2  Modeling  advantages  over  other  skewed  distributions . 

1.3  Scope  of  this  thesis . 

2  Shewhart  Control  Charts  for  IG  Processes 

2.1  Edgeman’s  work . . . 

2.2  Improvements . . 

2.3  Simplification . 

2.3.1  Average  Run  Lengths  . 

2.4  Self-starting  Shewhart  Charts . 

2.4.1  Self-Staxting  Shewhart  charts  for  location  for  the  A)  .  . 

2.4.2  Self-Starting  Control  charts  for  shape  for  the  /G(ju,  A) . 

2.5  Conclusion . 

3  Predictive  control  charts  for  the  IG 

3.1  Sample  of  size  one  . 


1 

1 

2 

8 

16 

18 

24 

27 

29 

29 

33 

33 

36 

36 

39 

43 

46 

48 

49 


VI 


50 


3.1.1  An  example . 

3.1.2  Comparison  with  the  self-starting  scheme . 

3.2  Sample  of  size  . . 

3.3  Loss  functions . 

3.3.1  Lin-quad  loss . 

3.3.2  LINEX  Loss . 

3.4  Conclusions . 

4  Cumulative  Sum  Charts  for  IG  Processes 

4.1  Scheme  construction . 

4.2  CUSUMs  for  location . 

4.3  CUSUMs  for  shape . 

4.4  ARLs  in  control . 

4.4.1  Two  pedagogical  notes . 

4.4.2  Comments  on  ARL  as  a  measure  of  effectiveness  of  a  CUSUM 

scheme . 

4.5  Performance  for  small  persistent  shifts  . 

4.5.1  Small  persistent  shifts  in  /i  .  .  . . 

4.5.2  Small  persistent  shifts  in  A . 

4.6  Conclusions . 

5  CUSUM  Embellishments  for  IG  processes 

5.1  Fast  initial  response  CUSUM  . 

5.1.1  An  example . 

5.2  Self-starting  CUSUM  for  the  mean . 

5.3  Conclusions . 


52 

53 

57 

58 
60 
60 

62 

62 

64 

66 

69 

72 

72 

73 

73 

74 

76 

77 

77 

78 
81 
86 


vn 


6  Bivariate  Shewhart  control  charts  87 

6.1  Current  practices . 

6.2  Improved  diagnosis  of  an  out-of-control  signal .  89 

6.3  HPD  bivariate  control  regions .  91 

6.3.1  An  example .  97 

6.4  Implementation  of  basic  scheme .  97 

6.5  Comparison  with  traditional  charts .  99 

6.5.1  Normal  case .  99 

6.5.2  IG  case  .  199 

6.5.3  An  example .  199 

6.6  Interpretation  of  a  signal . 

6.7  Conclusions . 

7  Combat  Models 

7.1  Background  . .  H"! 

7.1.1  Underlying  hypothesis  of  Brownian  motion  for  combat  models  116 

7.1.2  Program  ma,intenance  and  unintended  effects .  119 

7.2  Goodness  of  fit  .  . .  121 

7.2.1  Goodness  of  fit  of  IG  model .  123 

7.3  Results .  128 

7.3.1  Self  starting  Shewhart  Charts .  126 

7.3.2  Predictive  Shewdiart  Charts .  129 

7.4  Self-starting  CUSUM  charts  for  the  mean .  129 

7.5  Conclusions .  189 

8  General  Motors  1-87 

8.1  Background . 187 


Vlll 


8.2  Results .  140 

8.2.1  Shewhart  Chart  for  the  mean .  141 

8.2.2  Shewhart  chart  for  A .  141 

8.2.3  HPD  chart  for  the  mean . 144 

8.2.4  HPD  chart  for  A  .  144 

8.2.5  CUSUM  chart  for  the  mean .  145 

8.2.6  CUSUM  chart  for  A .  145 

8.3  Conclusions .  147 

9  Conclusions  149 

9.1  Summary  and  significance .  149 

9.2  Future  work .  ISO 

A  Statistical  Computing  152 

A.l  Generating  IG  variates .  152 

A.2  CUSUM  ARL  FORTRAN  routines  with  explanation  .  153 

A.2.1  CUSARL  . .  153 

A. 2. 2  Finding  ARL  for  shifts  in  fx  and  A:  ARLl  -  8.f .  154 

A. 2. 3  Finding  decision  hmits  for  a  CUSUM  scheme .  155 

A. 3  Variance  reduction  techniques  for  IG  ARLs  with  code .  155 

A.  4  Data .  155 

B  BIBLIOGRAPHY  156 

B. l  Works  Cited .  156 

B.2  Other  references .  163 


ix 


List  of  Tables 


2.1  Comparison  of  Edgeman,  Symmetric,  and  HPD  Schemes,  I .  37 

2.2  Comparison  of  Edgeman,  Symmetric,  and  HPD  Schemes,  II  .  37 

2.3  Self-starting  results  for  a  data  stream  with  early  outlier,  o:  =  .01.  .  .  43 

2.4  Self-starting  results  for  a  data  stream  with  later  outlier,  a:  =  .01.  .  .  .  45 

3.1  Representative  predictive  results  for  early  outlier .  51 

3.2  Representative  predictive  results  for  later  outlier .  51 


4.1  Some  in-control  and  out-of  control  ARL  values  for  various  CUSUM 

parameters.  Out-of-control  values  are  taken  for  the  parameter  at  the 
alternate  (tuning)  value . 

4.2  Comparing  performance  by  ARL  of  various  control  schemes  to  detect 

a  small  persistent  shift  in  the  mean . 

4.3  Out-of-control  ARLs  for  a  small  persistent  shift  in  A.  Note  that  a  small 

persistent  change  in  A  can  be  very  difficult  to  detect . 

5.1  Table  of  comparable  values  for  regular  and  FIR  CUSUMs . 

6.1  Comparison  of  ARLs  for  bivariate  Shewhart  and  a  pair  of  standard 

charts.  . . 

8.1  Table  of  ARIjS  for  corrected  Edgem.an  Shewhart  Charts . 

8.2  ARLs  for  Shewhart  scheme  for  A,  CM  data . 

8.3  ARL  table  for  HPD  scheme  for  mean  for  CM  data . 

8.4  ARL  table  for  HPD  scheme  for  A  for  CM  data . 

8.5  Table  of  ARLs  for  CUSUM  for  the  mean,  CM  exam-ple . 


69 

74 

75 
80 


101 

141 

142 

145 

146 
146 


X 


8.6  Table  of  ARLs  for  CUSUM  for  A,  GM  example 


147 


XI 


List  of  Figures 


1.1  A  Shewhart  control  chart  for  the  sample  average,  with  the  process 

in-control . ^ 

1.2  A  Shewhart  chart  for  the  mean  for  a  process  out-of-control  due  to  a 

location  shift .  ^ 

1.3  A  Shewhart  chart  for  the  mean  for  a  process  out-of-control  due  to 

increased  variabihty . .  •  • 

1.4  A  graphical  description  of  the  SPRT .  10 

1.5  A  CUSUM  chart,  using  the  decision  interval  format .  13 

1.6  A  multiple  chart,  showing  both  Shewhart  charts  for  location  and  scale, 

and  the  4  CUSUMs  for  location  and  scale .  15 

1.7  First  passage  time  illustration  for  Brownian  motion  with  drift  ....  17 

1.8  A  sheaf  of  1)  densities  for  =  .5, 1, 1.5, 2,  5,  and  10 .  19 

1.9  A  sheaf  of  10(5,  A)  densities  for  A  =  1,  2,  5, 10,  and  25 .  20 

1.10  An  example  of  the  predictive  density  for  the  next  observation  from  an 

IG(n,  A),  for  five  previous  points:  {xi,=  3,  X2  =  4,  0:3  =  6,  2:4  =  3.5,  x^  =  2.5}.  25 

2.1  Illustration  of  the  HPD  region  for  a  skewed  distribution .  35 

2.2  Control  chart  for  the  self-starting  example  with  early  outlier,  a  =  .01.  42 

2.3  Control  chart  for  the  self-starting  example  with  later  outlier,  a  =  .01.  44 

3.1  Linear-quadratic  loss  function  for  target  value  0 .  59 

3.2  Linear-exponential  loss  function  for  target  value  0.  . .  61 

4.1  A  plot  of  (X,  S)  from  an  IG(3,  5} .  63 

4.2  A  chart  of  h  vs.  ln{ARL),  CUSUM  for  the  mean .  70 


XU 


4.3  A  chait  of  h  vs  ARL,  CUSIJM  for  A .  71 

5.1  Comparison  of  FIR  and  regular  ARI.  schemes .  79 

5.2  Self-starting  CUSUM  chart .  84 

5.3  Self-starting  CUSUM  for  the  mean,  longer  training  set  .  85 

6.1  Bivariate  chart  with  rectangular  limits .  88 

6.2  Diagnostic  regions  for  bivariate  Shewhart  chart  .  92 

6.3  Joint  distribution  of  X,  V,  and  f{X,V) .  93 

6.4  HPD  out-of-control  region  by  simulation .  94 

6.5  HPD  rejection  region .  95 

6.6  Level  curves  for  the  bivariate  Shewhart  chart  normal  example  ....  98 

6.7  Bivariate  Control  chart  for  Norm.al  Example,  up  to  observation  9.  .  .  102 

6.8  Bivariate  Control  chart  for  Normal  Example,  up  to  observation  9.  .  .  103 

6.9  Bivariate  Control  chart  for  Normal  Example,  up  to  observation  11.  .  104 

6.10  Bivariate  Control  chart  for  Normal  Example,  up  to  observation  12.  .  105 

6.11  Bivariate  Control  chart  for  Normal  Example,  up  to  observation  13.  .  106 

6.12  Bivariate  Control  chart  for  Normal  Example,  up  to  observation  14.  .  107 

6.13  Bivariate  Control  chart  for  Example,  up  to  observation  15.  The  process 

is  out  of  control,  and  has  now  signaled  out-of-control.  Applying  the 
rules  developed  in  Figure  6.2,  we  diagnose  a  mean  shift  only . 108 

6.14  Bivariate  HPD  regions  for  an  IG(3,  5) .  110 

6.15  Long  run  Bivariate  HPD  chart  for  7G(3,5)  with  1000  observation. 

Only  the  last  9  in-control  points  are  plotted.  There  are  10  outliers, 
marked,  with  diamonds .  HI 

6.16  Out-of-control  bivariate  HPD  chart,  with  1000  observations .  112 


7.1  Example  of  the  tail  behavior  when  an  JG  random  variable  is  plotted 

on  a  log-normal  probability  plot .  118 

7.2  Example  of  the  tail  behavior  when  the  reciprocal  of  an  /G  random 

variable  is  plotted  on  a  log-normal  probability  plot .  120 

7.3  A  histogram  of  force  exchange  ratios  from  the  Blue  Defense  vignette 

for  the  M1A2  lOTE.  The  fitted  IG  distribution  has  been  superimposed.  122 

7.4  Boxplot  of  the  four  data  sets .  125 

7.5  Self-starting  Shewhart  chart  for  the  mean .  127 

7.6  Self-starting  Shewhart  chart  for  A .  128 

7.7  Self-starting  Shewhart  chart  for  n  with  bootstrap .  130 

7.8  Predictive  Shewhart  chart .  131 

7.9  Predictive  Shew'hart  chart  with  bootstrapping .  132 

7.10  Self-starting  CUSUM  of  base  case  and  historical  case .  133 

7.11  Self-starting  CUSUM  of  base  case  and  first  model  change .  134 

7.12  Self-starting  CUSUM  of  base  case  and  second  model  change .  135 

8.1  Density  for  an  /G(42.6257,  66.282)  distribution .  139 

8.2  Density  of  V  in  and  out  of  control,  with  control  limits  superimposed.  143 


Chapter  1 

Problem  statement 

1.1  Control  Charts 

A  control  chart  is  a  graphical  tool  for  detecting  departures  from  an  assumed  distri¬ 
bution.  Charts  are  used  to  display  information  taken  from  samples  from  the  ongoing 
process,  and  signal  by  various  algorithms  when  it  is  likely  that  a  model  departure 
exists. 

There  are  two  broad  types  of  control  charts.  The  first  type,  due  to  Shewhart 
[Shewart,  1931],  measures  when  individual  samples  of  fixed  size  indicate  a  departure 
from  the  model.  For  example,  a  power  spike  occurs  during  the  manufacturing  process, 
causing  the  characteristic  (s)  of  interest  of  one  batch  of  output  to  follow  a  different 
distribution  from  historical  patterns.  Shewhart  charts  are  simple,  and  signal  large 
changes  quickly. 

The  second  type  of  control  chart,  due  to  Page  [Page,  1954],  detects  small,  per¬ 
sistent  changes  in  the  process  output.  Page  introduced  the  Cumulative  Sum  Chart 
(CUSUM).  Another  variation  on  the  idea  is  the  Exponentially  Weighted  Moving  Av¬ 
erage  (EWMA),  due  to  Roberts  [Roberts,  1959].  These  charts  use  the  previous  history 
of  the  process  and  provide  a  quicker  method  of  detecting  small  changes  in  the  process. 

One  can  chart  different  attributes  of  the  process.  Historically,  one  has  charted 
measures  of  process  location  and  dispersion. 

Control  charts  have  been  used  extensively  in  manufacturing  industries,  chem¬ 
ical  industries,  and  business.  The  most  common  assume  that  the  underlying  process 
can  be  well  modeled  by  the  normal  distribution. 
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Control  chart  methodologies  have  been  developed  for  processes  modeled  by 
many  members  of  the  exponential  family,  including  the  normal,  gamma,  poisson, 
and  binomial.  In  1989,  Edgeman  proposed  Shewhart  control  charts  for  the  inverse 
gaussian  (IG)  distribution  [Edgeman,  1989]. 

The  Shewhart  charts  suffer  from  limitations.  A  signal  for  a  change  in  location 
can  mean  either  a  change  in  process  mean,  or  an  increase  in  process  variability. 
This  is  discussed  below.  Accordingly,  one  must  consider  both  charts  together  when 
diagnosing  shifts  in  process  centrality.  Furthermore,  they  are  slow  to  signal  small 
changes  in  the  distribution. 

We  shall  extend  existing  theory  to  include  CUSUM  schemes  for  inverse  gaus¬ 
sian  distributed  random  variables.  We  explore  their  properties.  Later,  we  will  propose 
new  methods  for  constructing  Shewhart  charts,  and  explore  their  properties. 

1.1.1  Shewhart  charts 

Many  excellent  references  describe  the  basic  Shewhart  control  scheme.  They  include 
Montgomery  [1991]  and  Barnard  [1959],  among  others.  We  survey  the  general  ap¬ 
proach. 

Prom  historical  data  or  first  principles,  one  models  the  distribution  of  the 
process  for  a  given  characteristic  when  it  is  **in-control” .  *ln-control  means  that 
the  process  is  following  the  postulated  distribution.  Frequently,  the  process  is  taken 
to  be  normally  distributed,  but  there  are  schemes  for  discrete  distributions  as  well 
as  for  non-normal  interval  data.  The  process  characteristic  could  be  some  physical 
dimension  of  the  output  of  the  process,  or  the  number  of  defects,  or  whatever  is  of 
interest  to  those  monitoring  the  process. 

We  draw  samples  from  the  process  upon  which  to  base  our  inference  about 
the  process  characteristic.  If  we  draw  samples  of  size  greater  than  one,  we  desire  to 
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obtain  a  “rational  subgroup” .  A  rational  subgroup  is  a  sample  which  is  expected  to 
be  homogenous.  This  is  best  illustrated  by  counter-example:  a  sample  which  spanned 
two  shifts  of  workers  would  not  be  expected  to  be  completely  homogenous,  because  of 
worker  to  worker  variation.  A  rational  subgroup  has  no  plausible  a  priori  explanation 
for  being  a  mixed  distribution  of  in-control  and  out-of-control. 

Prom  the  distribution  of  the  characteristic,  we  determine  the  sampling  distri¬ 
bution  for  our  sample  size.  We  then  construct  limits  on  acceptable  values  for  sufficient 
statistics  for  drav/s  from  the  sampling  distribution,  using  either  probability  limits  or 
a  rule  of  thumb  based  on  a  given  number  of  standard  deviations.  We  construct  the 
control  chart  by  plotting  the  sufficient  statistic  vs.  draw  number  on  a  chart  which 
already  has  graphed  the  expected  value  of  the  statistic,  the  upper  control  limit,  and 
the  lower  control  limit.  See  Figure  1.1  for  an  example. 

If  the  sample  statistic  is  within  the  control  limits,  we  take  no  action.  If  the 
statistic  is  outside  the  limits,  we  consider  that  an  “out-of-control”  signal.  We  reject 
the  hypothesis  that  the  process  is  still  following  tire  postulated  model,  and  investigate. 

Analogous  with  hypothesis  testing,  we  are  concerned  with  two  types  of  errors. 
First,  the  process  may  signal  a  model  departure  when  none  exists.  If  the  control 
limits  are  constructed  so  that  the  probability  that  the  sample  is  within  the  limits  is 
1  —  cc,  then  the  probability  of  a  false  signal  is  ex.  The  number  of  samples  to  a  false 
signal  is  ca.lled  the  Run  Length.  The  distribution  of  the  run  length  for  a  process 
ia-control  is  geometric.  The  expected  number  of  samples  until  a  false  signal  is  called 
the  Average  Run  Length  (ARL)  and  is  equal  to  This  is  the  traditional  measure 
of  effectiveness  for  a  control  chart.  Long  ARL  is  desirable  when  in-control,  for  there 
are  costs  associated  with  investigating  false  signals. 

Given  a  model  departure  of  specified  form  and  magnitude,  we  can  compute 
the  probability  of  a  signal,  and  the  corresponding  ARL  for  detecting  the  model  depar¬ 
ture.  When  out-of-control,  short  ARLs  are  desirable,  since  there  are  (usually)  costs 
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Figure  1.1:  A  Shewhart  control  chart  for  the  sample  average,  with  the  process  in- 
control. 
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associated  with  model  departures  of  various  sizes.  We  can  construct  the  operating 
characteristic  curve  as  in  classical  hypothesis  testing. 

It  is  accepted  practice  to  chart  for  both  location  and  scale,  when  dealing 
with  normal  processes.  Consider  a  shift  from  the  distribution  Xi  ~  N  (^,  cr)  to  the 
distribution  Xo  ~  A  x  a^),  A  >  1.  A  chart  for  location  alone  would  eventually 
detect  such  a  shift,  but  it  would  not  be  efficient.  Worse,  such  a  signal  might  be 
interpreted  as  a  mean  shift,  not  a  scale  shift.  This  is  illustrated  in  Figures  1.2  and 
1.3.  Notice  both  location  and  scale  shifts  are  signaled  on  the  location  chart.  The 
scale  shift  signals  because  the  increased  variability  increases  the  probability  that  the 
process  will  exceed  the  control  limits. 

Conventional  practice  interprets  signals  as  follows:  if  there  is  a  signal  on  the 
scale  chart,  consider  that  a  shift  in  scale  has  occurred.  If  there  is  a  signal  on  the 
mean  chart  only,  consider  that  a  shift  in  mea.n  has  occurred.  If  there  is  a  signal  in 
both  charts,  it  may  be  due  to  either  a  shift  in  scale  alone,  or  a  shift  in  both  scale  and 
location.  Investigate  both  possibilities. 

As  with  classic  hypothesis  testing,  there  is  a  tradeoff  between  the  probability 
of  type  I  and  type  II  error,  here  given  by  the  relative  size  of  the  ARL  in-control  vs. 
ARL  out-of-control  for  various  model  departures. 

Depending  on  our  criteria,  we  select  sample  size,  sample  frequency,  and  the 
control  limits  to  best  meet  our  needs.  The  criteria  can  be  economic,  and  incorporate 
different  losses  for  false  positive  signals,  false  negative  signals,  and  sampling  costs. 
The  criteria  can  be  strictly  statistical.  Optimal  design  against  these  criteria  is  dis¬ 
cussed  by  many  authors,  including  Girshick  and  Rubin,  [1952],  Bather,  [1963],  Ross 
[1971],  Savage[1962],  and  Taylor  [1965,  1967].  A  good  survey  is  given  in  Montgomery 
[1991]. 

The  advantages  of  Shewhart  charts  are  simplicity  and  immediate  sensitivity  to 
large  model  departures.  However,  for  small  model  departures,  the  ARL  until  a  signal 
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Figure  1.2:  A  Shewhart  chart  for  the  mean  for  a  process  out-of-control  due  to  a 
location  shift. 
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can  be  quite  large.  Depending  on  the  losses  associated  with  large  out-of-control  ARLs, 
this  can  be  costly  for  the  process  manager. 

1.1.2  Cumulative  Sum  (CUSUM)  control  charts 

The  Shewhart  chart  tests  the  hypothesis  that  the  sample  came  from  the  “in-control” 
distribution.  Treating  each  sample  separately,  it  does  not  use  all  the  information 
available  in  the  case  where  the  out-of-control  affects  more  than  one  rational  sub¬ 
group.  Various  rules  for  declaring  signals  that  the  process  is  out-of-control  have  been 
proposed  to  use  more  of  the  data  available.  For  example,  one  author  proposes  the 
process  be  declared  out-of-control  if  8  points  in  sequence  are  above  the  centerline  of 
the  chart,  reasoning  that  the  probability  that  8  consecutive  independent  draws  from 
a  distribution  are  above  the  median  is  so  unlikely  (1/2®  =  .0039)  as  to  signal  a  depar¬ 
ture  from  the  null  hypothesis  [Western  Electric,  1956].  More  generally,  the  Western 
Electric  rules  signal  the  process  out-of-control  if  k  out  of  N  of  the  preceding  points  are 
above  or  below  the  median.  Champ  and  Woodall  [1987,  1990]  discuss  supplementary 
rules  and  provide  methods  for  determining  ARLs. 

The  cumulative  sum  chart  generalizes  this  idea.  It  works  with  the  sum  of 
previous  observations  (or  transformations  of  observations.)  As  A.L.  Goel  says, 

This  system  of  charting  takes  full  advantage  of  the  historical  record 
and  provides  a  rapid  means  of  detecting  shifts  in  the  process  level.  [Goel, 

1982] 

In  other  words,  the  Shewhart  chart  is  designed  to  detect  large,  isolated  shifts 
in  the  process.  The  CUSUM  is  designed  to  detect  persistent,  perhaps  small,  shifts  in 
the  process. 

Standards  references  for  the  CUSUM  include  Van  Dobben  de  Bruyn  [1966], 
Johnson  [1961],  and  Johnson  and  Leone  [1962a,  1962b,  1962c]. 
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The  roots  of  the  CUSUM  lay  with  the  sequential  probability  ratio  test  (SPRT). 
This  test  was  developed  by  Abraham  Wald  while  at  the  Statistical  Research  Group 
at  Columbia  during  the  Second  World  War.  It  is  interesting  (to  this  candidate)  that 
he  attributes  the  genesis  of  the  theory  “in  connection  with  some  comments  made  by 
Captain  G.  L.  Schuyler  of  the  Bureau  of  Ordnance,  Navy  Department.”  [Wald,  1947]. 

Wald  published  on  the  theory  of  cumulative  sums  of  random  variables  [Wald, 

1944]. 

Page  [Page,  1954]  adapted  these  methods  to  quality  control,  proposing  the  first 
quality  charts  based  on  cumulative  sums  of  observations  and  calling  them  “CUSUM” 
charts. 

We  will  use  the  following  decision  interval  characterization  of  the  CUSUM. 
Consider  the  model  where  X  has  density  /(a;|0),  i.e.  A  is  a  random  variable  whose 
distribution  is  indexed  by  the  parameter  9.  We  consider  9  known  and  fixed  at  a 
value,  say  ^o-  We  observe  the  sequence  Xi,  X2,  •  ■  •  >  Xn-  We  are  interested  in  testing 
the  hypothesis  that  9  =  9q  against  the  alternative  9  ^  ^o-  Instead  of  dealing  with 
a  composite  alternative  hypothesis,  we  consider  two  point  alternatives;  9  =  9i  or 
9  —  9y,. 

Wald  conjectured  [Wald,  1947]  and  later  proved  [Wald  and  Wolfowitz,  1948] 
that  the  sequential  probability  ratio  test  was  optimal  for  deciding  between  the  two 
point  hypotheses  in  the  sense  that  the  expected  number  of  points  sampled  before  a 
decision  could  be  reached  was  minimized  with  the  SPRT.  A  precise  statement  of  these 
optimality  properties  of  the  SPRT  in  a  decision  framework  can  be  found  in  [Ferguson, 
1967]. 

The  SPRT  considers 

_  / (-^1)  -^2)  •  •  ■  )  Nn\9l)  _  fiXi\9i) 

“  /(Xi,  X2, . . . ,  XnW,  ~  i\  f{Xi\9o) 

where  f{x\9)  is  the  joint  or  marginal  density  as  appropriate.  The  SPRT  accepts 
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Figure  1.4:  A  graphical  description  of  the  SPRT. 

iJo  :  ^  =  00  if  A„  <  A,  accepts  Ha  :  6  =  &!  K  >  B  and  otherwise  continues 
sampling.  This  is  illustrated  in  Figure  1.4,  with  A  —  —3  and  5=3,  where  the  null 
hypothesis  would  have  been  rejected  at  observation  number  4. 

In  practice,  we  work  with  the  log-likelihood,  or  ln(A„),  which  results  in  a 
cumulative  sum.  We  accept,  reject,  or  continue  sampling  based  on  the  value  of  this 
cumulative  sum.  As  we  have  written  it,  the  log-likelihood  ratio  will  have  a  negative 
expected  value  when  the  process  is  in-control.  When  the  process  is  well  modeled 
by  the  alternate  hypothesis,  the  log-likelihood  ratio  will  have  a  positive  expected 
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value.  As  a  result,  when  the  process  is  in-control,  the  sum  tends  downward.  When 
the  process  is  out-of- control  at  the  alternative  distribution,  the  sum  tends  upward. 
When  the  sum  is  above  a  certain  limit,  we  have  evidence  in  favor  of  the  alternative 
hypothesis.  When  the  sum  is  below  a  certain  hmit,  we  decide  in  favor  of  the  null 
hypothesis.  When  the  sum  is  in-between  the  limits,  we  continue  to  sample. 

The  CUSUM  is  based  on  this  test,  except  that  the  null  hypothesis  is  never 
accepted.  Instead,  we  restart  the  test  each  time  the  evidence  favors  the  null  hypoth¬ 
esis.  We  consider  the  evidence  as  favoring  the  null  hypothesis  whenever  the  sum  is 
negative.  At  that  point,  we  start  over  by  resetting  the  sum  to  zero.  We  sample  until 
we  reject  the  null  hypothesis  in  favor  of  the  point  alternative. 

For  members  of  the  one-parameter  exponential  family,  we  have  the  following 
general  case: 

hif{x\d)  =  a{x)b{9)  +  c{x)  +  d{9) 

Then  the  log-likelihood  ratio  SPRT  implies: 


a  <  In  A„  <  i) 


(1.2) 


a  <  '^a{xi)b{9i)  +  c{xi)  -!-  ^(^i)  -  (^a{xi)b{9Q)  -h  c(xi)  +  d{9o))  <  b 

i=l  i=l 

a _  .  A  /  .  ,  .  ^(^i)  -  di9o)\  b 

H^i)  -  ^  V  ”  b{9i)  -  b{9o) 


(1.3) 


Accordingly,  the  SPRT  is  equivalent  to  testing  if  a'  <  +  k  <  b',  where 


b 

d'(9i)  -  d(9G) 
'b{9i)-b{9o) 


(1.4) 

(1.5) 

(1.6) 
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The  test  statistic  in  the  SPRT  is  the  cumulative  sum,  E?=i(a(a^i)  +  k). 

The  CUSUM  differs  from  the  SPRT  is  two  ways.  First,  we  never  accept  the 
null  hypothesis,  so  anytime  the  CUSUM  is  negative,  which  favors  the  null  hypothesis, 
we  start  over.  In  other  words,  the  indices  for  the  sum  are  different.  Less  importantly, 
the  way  we  have  constructed  the  SPRT  rejects  the  null  hypothesis  if  the  CUSUM 
is  greater  than  a'  and  b{9i)  >  b(9o)-  We  prefer  this  rejection  region  on  the  positive 
side.  Reversing  the  roles  of  Oq  and  9i  if  necessary  accomplishes  that  for  the  case 
b(9i)  <  b{9o).  Since  k  is  usually  negative,  we  also  change  the  sign  of  k  in  our  notation. 
This  notation  is  consistent  with  the  industry  standard,  described  in  Van  Dobben  de 
Bruyn  [1966]. 

The  scheme  then  becomes: 

5J-  =  0  (1.7) 

S:^  =  msix(0,S^_iPa{xn)  -  k'^)  (1.8) 

and  signals  when  5+  >  h'^. 

Say  in  the  case  above,  9i  >  9o.  We  can  also  construct  a  scheme  to  detect  the 
changes  in  the  other  direction,  for  92  <  9o-  The  second  scheme,  for  a  shift  in  the 
opposite  direction,  is: 


5o-  0  (1-9) 

S~  =  S~_i  +  a{xn)  -  k~)  (1-10) 

This  chart  signals  if  5]^  <  /i~.  Many  practitioners  change  the  sign  of  A:"  to  be  positive 
if  it  is  negative,  so  one  frequently  sees  Equation  1.10  written  as  =  min(0,  S^_i  + 
A  k  ). 

We  run  both  schemes  simultaneously  to  detect  changes  in  9.  This  results  in  a 
pair  of  schemes  being  plotted  on  the  same  chart,  as  illustrated  in  Figure  1.5. 


CUSUH 
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Figure  1.5:  A  CUSUM  chart,  using  the  decision  interval  format. 
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We  can  also  decide,  based  on  costs  of  being  out-of-control  in  one  direction  or 
the  other,  to  set  asymmetric  values  of  6>i  and  6»2,  resulting  in  asymmetric  values  of 
and  h~. 

The  upward  and  downward  CUSUMs  will  each  have  its  own  ARL.  If  -|- A;~  > 
—  h~\y  then  overall  ARL  for  the  two-sided  scheme  is  ; 


ARL  - 


ARL 


•Slower 


ARLu 


(1.11) 


[Van  Dobben  de  Bruyn,  1966]. 

With  uniform  scaling,  it  is  possible  to  run  both  the  Shewhart  and  the  CUSUM 
schemes  on  the  same  chart.  Some  practitioners  recommend  putting  as  many  as  6 
charts  on  the  same  graph,  for  the  two  parameter  normal  distribution:  A  Shewhart 


chart  for  location,  another  Shewhart  chart  for  scale,  two  CUSUMs  for  location  (upper 
and  lower),  and  two  CUSUMs  for  scale  (upper  and  lower).  This  is  illustrated  in 
Figure  1.6.  About  such  multiple  charts,  Hawkins  [1992b]  says. 


Suprisingly,  this  chart  with  up  to  6  things  plotted  does  not  contain 
much  ‘clutter’.  Under  control,  each  of  the  four  cusums  spend  much  of  its 
time  running  along  the  axis,  and  the  two  positive  and  negative  cusums  are 
confined  to  their  own  halves  of  the  paper.  Showing  them  as  solid  lines  and 
adding  the  Shewhart  points  as  individual  unconnected  symbols  generally 
gives  quite  a  clear  presentation. 

Moustakides  [1986]  showed  that  the  CUSUM  scheme  enjoyed  the  same  opti¬ 
mality  properties  as  the  SPRT.  Among  all  tests  with  the  same  in-control  ARL,  the 
CUSUM  had  the  smallest  expected  run  length  out-of-control.  The  reader  is  referred 
to  Moustakides  for  a  precise  statement  and  proof. 

An  interesting  interpretation  of  the  Shewhart  chart  classifies  it  as  a  CUSUM 
with  /i  =  0  and  A:  =  the  control  limit.  Then  the  Shewhart  chart  is  a  CUSUM  which 
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Figure  1.6:  A  multiple  chart,  showing  both  Shewhart  charts  for  location  and  scale, 
and  the  4  CUSUMs  for  location  and  scale. 
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signals  immediately  if  a (X)  >  A:,  but  restarts  for  any  observation  less  than  k. 

We  will  examine  both  CUSUMs  and  Shewhart  charts  in  the  body  of  this  thesis. 


1.2  Inverse  Gaussian  processes 

The  inverse  gaussian  distribution  has  its  genesis  in  the  analysis  of  Brownian  motion. 
Following  Chhikara  and  Folks  [1989],  we  characterize  the  Wiener  process  X  (t)  with 
drift  ly  and  variance  as  follows: 

1.  X(t)  has  independent  increments;  for  ti  <  t2  <  tz  <  t4,  we  have  X{t2)  —  X{ti) 
independent  of  X  (t^)  —  X  (ts) 

2.  x{t2)  -X(ti)  is  normally  distributed  with  mean  i/{t2  -  h)  and  variance  a^{t2  - 
ti)y  with  t2  >  ti- 

Schroedinger  [1915]  first  showed  that  the  distribution  of  the  first  time  until 
the  process  X  (t)  >  a  for  j/  >  0,  a  >  0  was  inverse  gaussian.  See  Figure  1.7  for  an 
illustration. 

This  characterization  of  the  inverse  gaussian  is  useful  for  the  applications  which 
follow  in  this  thesis.  Many  processes  can  be  well  modeled  by  Brownian  motion  with 
drift.  We  shall  see  how  the  attrition  of  forces  in  a  military  model  can  be  approximately 
modeled  by  the  inverse  gaussian  distribution.  We  shall  also  examine  how  well  the  time 
to  complete  the  work  at  a  station  on  an  automobile  assembly  line  is  modeled  by  an 
inverse  gaussian  distribution. 

The  inverse  gaussian  distribution  is  also  useful  for  modeling  of  positive,  skewed 
processes  in  general,  even  when  the  underlying  mechanics  do  not  immediately  suggest 
a  theoretical  basis  for  Brownian  motion  passage  times. 

The  reader  may  notice  a  striking  similarity  between  Figures  1.4  and  1.7.  Wald 
[1944,  1945,  1947]  showed  that  the  distribution  of  the  stopping  times  in  the  SPRT 
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with  lower  limit  6  =  —  oo  and  upper  limit  a  was  approximately  distributed  as  inverse 
gaussian.  Since  the  stopping  times  for  the  SPRT  are  discrete  random  variables  and 
the  inverse  gaussian  is  a  continuous  random  variable,  the  two  distributions  can  only 
be  approximately  equal. 

1.2.1  Well  Known  Properties 

In  this  section,  we  list  several  well  known  results  about  the  inverse  gaussian  distribu¬ 
tion,  which  later  prove  useful. 

1.2.1. 1  PDF 

The  probability  density  function  (pdf)  for  X  ~  A)  is 

=  r!:>0  (1.12) 

Several  density  curves  for  various  values  of  /x  and  A  can  be  seen  in  Figures  1.8 
and  1.9. 

1.2.1.2  CDF 

The  cumulative  distribution  function  for  X  7G(/x,  A)  is 


A)  =  3> 


(1.13) 


where  3>(x)  is  the  CDF  of  the  standard  normal  distribution.  [Chhikara  and 
Folks,  1989] 


£Ck) 
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Figure  1.9:  A  sheaf  of  IG(5,  X)  densities  for  A  =  1,  2, 5, 10,  and  25 


CHAPTER  1.  PROBLEM  STATEMENT 


21 


1.2.1. 3  First  passage  time  interpretation 

It  is  well  known  [Chhikara  and  Folks,  1989]  that  for  a  Wiener  process  with  positive 
drift  >  0)  the  first  passage  time  to  the  barrier  is  inverse  gaussian  with 

(a—Xo)  .  _  j  \  (a— ajo)^ 

A*  =  and  A  =  ^-^5— . 

1. 2.1.4  Characteristic  function  and  moments 


The  characteristic  function  for  X  IG{n,  A),  4'(t)  =  E  exp{iXt),  is 


'^(t)  =  exp  I  — 


1  - 


3 

Prom  this,  it  follows  that  the  mean  of  X  is  /u  and  the  variance  of  X  is 


1.2.1. 5  Member  of  the  exponential  family 

The  inverse  gaussian  distribution  is  knov/n  to  belong  to  the  exponential  family  of 
order  two.  Let  9  =  A.  Then  the  pdf  can  be  expressed  as 

/(a;;A,0)  =  exp  (A0/2)  3;-^/^exp  ^  (1.14) 

which  is  of  the  natural  exponential  family 

c(a;)d(9)exp(a(.'i:)  •  b{Q)) 

with  6(0)  =  0  =  (A,  9)  and  a(a:)  =  ~l/2{x  a:). 

Accordingly,  for  a  random  sample  X  from  the  inverse  gaussian,  the  two  di¬ 
mensional  statistic  (EX,EX~^)  is  complete  and  minimal  sufiicient. 


1.2.1. 6  MLEs  of  parameters  and  their  distribution 


For  a  random  sample  Xi,  Xq,  ■  ■  ■ ,  Xn  where  Xi  IG{ix,  A),  the  likelihood  function  is 

nl2 

L{ti,  A)  =  ' 


2n 


TT  -3/2\  ( 

'  exp  -a;^ 


7 
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and  the  maximum  likelihood  estimators  of  fz  and  A  are  easily  seen  to  be 


fi  =  X 


and 


These  estimators  were  obtained  by  Schroedinger  [1915]. 

The  distribution  of  these  estimators  is  also  known.  Tweedie  [Tweedie,  1957a 
and  1957b]  proved  the  following  results. 

Let  Xi,  X2,  ■■■,  Xn  be  independent  identically  distributed  as  inverse  gaussian 
with  finite  first  and  second  moments.  Then  Y^Xi  and  ^  —  n‘^{J2Xi)  ^  are  inde¬ 

pendently  distributed. 

Tweedie  showed  in  his  proof  that  X  IG(fL,nX).  He  defined 


T  = 


(1.15) 


and  also  proved 

-  xl-i 


1.2.1. 7  Related  Distributions 
Let 

...  _{X- 

Then 

AY^  ~  xl 

that  is,  AY^  has  the  chi-square  distribution  with  one  degree  of  freedom.  [Chhikara 
and  Folks,  1989]. 
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Let  Xi,  X2, .  ■ . ,  Xnx  and  Yi,  >2;  •  ■  • ,  L'ny  be  independent  random  samples  from 
IGifi,  A).  Let 

nx  /  1  1  \ 

(1-16) 


and  let  Vy  be  similarly  defined.  Then  it  follows  from  its  form  as  the  ratio  of  two 
independent  random  variables  and  is  well  known  [Chhikara  and  Folks,  1989]  that 

(ny  -  1)V>  . .  (1,17) 


R  = 


(nx-l)Fy 

where  is  the  standard  F  distribution  with  nx  —  1  and  ny  —  1  degrees  of 

freedom. 


1.2. 1.8  Predictive  distribution  for  non-informative  prior 

To  obtain  natural  conjugate  priors,  it  is  advantageous  to  reparameterize  the  dis¬ 
tribution.  This  poses  no  logical  difficulty  in  the  predictive  framework,  where  the 
parameters  will  be  integrated  out  in  the  process.  The  parameterization  we  use  sets 
Ip  =  l/H,  and  is  due  to  Tweedie. 

Lacking  data,  one  would  often  prefer  a  non-informative  prior  distribution. 
Jeffrey’s  prior  sets  p(ip,  A)  oc  ^/|/(V.’,  A)j.  Unfortunately,  this  prior  produces  a  posterior 
which  is  not  a  proper  distribution  [Chhikara  and  Folks,  1989]. 

If  one  considers  instead  a  diffuse  prior  for  the  parameters,  the  predictive  distri¬ 
bution  for  the  next  observation  given  a  series  of  observations  from  the  inverse  gaussian 
is  known  and  due  to  Chhikara  and  Guttman  [1982].  The  prior  used  is 

p{‘ip,  A)  oc  A""^. 

Then  let  x  ~  {xi,  xq,  ■ .  ■ ,  x„}  be  n  independent  observations  from  A),  and  Y 

be  an  additional  observation  taken  independently  of  x.  For  y  >  0, 


h(yjx)  =  k 


xX 

1/2  p 

(nx  -1-  y)y‘\ 

(x  -  y)^X 


—n/2 


xy{nx  +  y) 


(1.18) 
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where 


S't,n  (  («  +  1)  y' 


^  (i.  ¥)  -v 


z(nx+y)  J 


(1.19) 


Sf  denotes  the  Student’s  t  cumulative  distribution  function  with  n  degrees  of  free¬ 


dom,  and 


z  =  nv  +  _  .  _  , — r 
xy\nx  -H  y) 

Note  that  the  expression  given  in  Equation  1.19  for  k  is  correct,  and  rectifies  an 
existing  unnoticed  error  in  both  the  Technometrics  article  by  Chhikara  and  Guttman 
[1982]  and  the  Chhikara  and  Folks  monograph  [1989]. 

As  an  example,  consider  the  following  5  observations  from  an  IG{y,  A)  distri¬ 
bution,  with  y  and  A  unknown;  {a:i  =  3,X2  =  4,  xz  =  6,  3:4  =  3.5,  X5  =  2.5}.  Then  the 
graph  of  the  predictive  density  for  the  next  observation  can  be  seen  in  Figure  1.10. 

Predictive  limits  for  Y  can  be  obtained  by  solving  the  appropriate  integral 
equation  using  /i(y|x).  To  find  the  lower  predictive  limit,  Z(x),  given  the  data  x,  for 
the  next  observation  with  probability  a/2,  we  solve  numerically  the  integral  equation: 


n{x  -  y)‘ 


/•/(x) 

/  h{y|x)  dy 

./n 


(1.20) 


and  similarly  for  the  upper  predictive  limit,  u(x). 

We  derive  the  more  general  joint  predictive  distribution  for  the  next  m  obser¬ 
vations,  given  the  first  n,  below.  We  also  derive  tighter  predictive  limits. 


1.2.2  Modeling  advantages  over  other  skewed  distributions 

There  are  three  major  advantages  to  using  the  inverse  gaussian  distribution  to  model 
skewed  data.  The  first  is  an  appeal  to  the  underlying  physical  properties  of  the  process 
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Figure  1.10:  An  example  of  the  predictive  density  for  the  next  observation  from  an 
IG{fi,  A),  for  five  previous  points:  {x\  —  Z,X2  =  4,  0:3  =  6,  0:4  =  3.5,  x^  =  2.5}. 
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being  modeled.  The  second  discusses  the  notion  of  failure  rates,  and  their  asymptotic 
behavior.  The  third  is  based  on  the  tractability  of  the  sampling  distribution  of  the 
inverse  gaussian.  We  discuss  each  of  the  three  in  turn. 

If  the  underlying  process  can  be  thought  of  as  a  Wiener  process,  then  the 
use  of  the  inverse  gaussian  seems  especially  appropriate.  The  time  to  failure  of  a 
complex  system  may  depend  on  the  accumulated  additive  effects  of  many  small  per¬ 
turbations.  Compare  this  with  the  log-normal  distribution,  whose  application  to 
modeling  depends  on  multiplicative  effects,  which  are  often  difficult  to  defend  from 
first  principles. 

Secondly,  consider  the  failure  rate,  r(t)  of  a  system  as  a  function  of  time.  We 


define 


f(t) 


where  r(t)  denotes  the  instantaneous  rate  of  failure  for  a  process  conditional  on  its 
having  lasted  a  certain  time.  For  a  Poisson  process  with  time  to  failure  modeled  by 
the  exponential  distribution  with  parameter  A,  r(t)  =  A;  a  constant  failure  rate. 

The  assumption  of  constant  failure  rate  is  rather  strong.  Some  applications  call 
for  a  monotonic  failure  rate:  some  with  an  increasing  failure  rate  (IFR,)',  others  for  a 
decrea.sing  failure  rate  (DFR).  For  these,  it  is  possible  to  use  the  Weibull  distribution, 
with  density  =  a/?-“exp(-(5:/^)'"),  (x,a,6  >  0).  Then  r(t)  = 

and  is  decreasing  for  ck  <  1  and  increasing  for  a  >  1. 

In  m.any  situations  which  are  characterized  by  a  “burn  in”  process,  it  seems 
appropriate  to  have  an  initially  increasing  then  decreasing  failure  rate  [Chhikara  and 
Folks,  1977]. 

An  initially  IFR  then  DFR  process  is  often  modeled  by  the  log-normal  distri¬ 


bution.  For  such  a  process. 


r(t)  ”  P)/^))- 
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This  failure  rate  is  non-monotonic;  increasing  then  decreasing  asymptotically  to  zero. 
For  many  reliability  situations,  this  asymptotic  failure  rate  of  zero  seems  illogical. 

An  alternate  model  uses  the  inverse  gaussian  process.  Its  failure  rate  is  given 


by: 


r{t)  = 


fit) 


(21^3) 

1/2 . 

1  exp  1 

1 

1 

to 

#1 

)  _  e2VM$  1 

(  A 
V  V  ^ 

(1  +  i)) 

(1.21) 

(1.22) 


This  failure  rate  is  also  non-monotonic,  initially  increasing  then  decreasing. 
However,  its  asymptotic  failure  rate  is  given  by 


A 

2^*2 


It  is  easily  shown  that  the  maximum  value  of  r{t)  occurs  at  the  value  t*  which  is  the 
solution  to  the  equation 


..  A  3  A 


and  the  maximum  failure  rate. is  r{t*). 

This  provides  a  strong  argument  for  using  the  inverse  gaussian  over  the  log¬ 
normal  distribution  to  model  lifetimes.  It  is  hard  to  conceive  of  physical  processes 
where  the  failure  rate  vrould  decrease  to  zero. 

The  third  argument  for  using  the  inverse  gaussian  is  that  the  sampling  distri¬ 
bution  of  the  MLEs  of  the  parameters  are  known  and  easy  to  work  with,  as  above. 
Using  the  inverse  gaussian  avoids  the  need  to  transform  the  data  prior  to  finding 
MLEs,  as  is  the  case  with  the  log-normal  distribution. 


1.3  Scope  of  this  thesis 

This  thesis  extends  and  corrects  the  work  of  Edgeman,  who  first  developed  Shewhart 
control  charts  for  the  inverse  gaussian  distribution.  We  develop  self-starting  and 
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predictive  Shewhart  charts,  and  discuss  their  features.  We  then  develop  CUSUM 
charts  for  the  inverse  gaussian  distribution,  along  with  several  variants,  and  discuss 
their  properties.  We  also  propose  and  develop  a  new  type  of  bi-variate  Shewhart 
chart,  and  apply  it  to  the  inverse  gaussian  case. 

We  apply  these  methods  in  two  settings.  First,  we  model  the  loss  exchange 
ratio  (LER)  for  a  military  mission  as  an  inverse  gaussian  random  variable.  The  LER 
is  the  ratio  of  enemy  casualties  to  friendly  casualties.  We  use  standard  military  mod¬ 
eling  packages  to  generate  this  data  using  high-resolution  simulation  —  simulation  in 
the  war-gaming,  not  statistical,  sense.  We  verify  that  the  inverse  gaussian  distribu¬ 
tion  provides  a  reasonable  fit.  ^Ve  then  show  how  statistical  process  control,  and  in 
particular  the  methods  of  this  thesis,  can  alert  the  decision  maker  to  a  shift  in  the 
distribution,  and  the  significance  of  such  an  alert.  We  explain  the  significance  of  this 
application. 

Second,  we  consider  an  example  from  the  literature  where  the  time  to  complete 
a  task  on  a  General  Motors  assembly  line  has  been  well  modeled  by  the  inverse 
gaussian  distribution.  M^e  apply  our  methods  to  that  case,  and  discuss  the  significance 
of  that  application. 


Chapter  2 

Shewhart  Control  Charts  for  IG 
Processes 


In  this  chapter,  we  discuss  Shewhart  Control  charts  for  processes  modeled  by  the 
inverse  gaussian  distribution. 


2.1  Edgeman’s  work 

Ed  gem  an  [1989]  proposed  control  charts  for  the  inverse  gaussian  distribution.  He  pro¬ 
posed  charting  transformations  of  the  sufficient  statistics  X  and  E,  discussed  above. 
The  transformations  were  originally  given  by  Chhikara  and  Folks  [1976]. 

Let  X  ~  A).  Assume  A  is  fixed  and  known.  Chhikara  and  Folks  [1976] 

showed  that  the  uniformly  most  powerful  unbiased  test  for  Ho  :  /w  =  against 
H^  :  /i  7^  Aio  is  of  the  form  X  >  ki  ox  X  <  k2,  since  the  IG  has  a  monotone  likelihood 
ratio,  and  is  in  the  one-parameter  exponential  family  when  A  is  known.  To  obtain  a 
pivotal,  they  proposed  the  transformation  to  the  statistic 


(Nxy/Hx-iio) 

The  cumulative  distribution  function  of  Y  is  given  by 


G{y)  ^  d>(y)  -f-  exp(2A/Ai)$ 


(2.1) 


where  $(y)  is  the  standard  normal  CDF.  It  follows  from  the  CDF  of  Y  that  |y|  has 
the  folded  normal  distribution.  The  rejection  region  for  the  test  is  given  by 


|T|  >  Zi_a/2 


29 
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where  zi-ot/2  is  the  o:  critical  value  from  the  standard  normal  distribution. 

Similarly,  when  A  is  unknown,  the  test  statistic  becomes 

w  =  (2.2) 

where  V  is  defined  as  in  Equation  1.15.  The  critical  region  is 

\W\>  ti_a/2 

where  ti-a/2  is  the  critical  value  from  the  t  distribution  with  n  - 1  degrees  of  freedom. 

The  tests  for  one-sided  alternative  hypotheses  are  not  quite  as  simple  in  form, 
but  exist  and  are  discussed  in  [Chhikara  and  Folks,  1989]. 

Edgeman  replaced  /io  with  fj,  and  constructed  confidence  intervals  for  /i  of  the 


1  +  Z^^^,2^JX|{N\)  max  ^0, 1  —  Zi-a/2\[^ t 


when  A  is  known  and  of  the  form 


1  -f-  tl- 


I — )  / 

max  (^0, 1  -  ti_a/2i 


I  XV 
N{N-1) 


when  A  is  unknown. 


The  right  hand  limits  contain  the  “max”  function  in  the  denominator  to  re¬ 
strict  >  0. 

Edgeman  then  assumed  that  there  was  historical  data  based  on  about  “M  = 
20  to  M  =  25  samples”.  Prom  that  data,  he  substituted  the  grand  average  of  all 
observations,  X  for  X ,  and  V  for  F,  to  obtain  lower  and  upper  control  limits  (LCL 
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and  UCL,  respectively)  for  the  process  centrality  when  A  is  known  and  unknown. 


(2.3) 

(2.4) 

(2.5) 

(2.6) 


The  center  line  for  the  chart  (CL)  for  the  mean  is  given  by  X .  The  subsequent 
sample  averages  are  plotted  against  these  limits,  and  the  process  is  signaled  as  “out- 
of-control”  for  process  centrality  if/ when  a  sample  average  exceeds  the  control  limits. 

We  shall  see  in  a  later  section  that  the  performance  of  these  charts  is  very 
poor.  In  checking  the  original  article,  we  find  no  performance  data  for  Edgeman’s 
scheme,  only  for  a  competitor. 

We  conjecture  that  Edgeman’s  original  article  contains  an  error.  We  find 
that  if  we  proceed  as  Edgeman  did,  but  substitute  the  historical  mean  for  fiQ  into 
Equation  2.1,  and  then  construct  upper  and  lower  bounds  on  X ,  we  obtain: 


(2.7) 

(2.8) 

(2.9) 
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Similarly,  if  A  is  unknown,  we  proceed  from  Equation  2.2  and  obtain: 


>\unknawn 


XVh 


XVh 


U C L Xy^fikfioion  4n(Af  ”1)  v  ■  / 

We  use  Vh  =  =  which  is  our  unbiased  estimator  for  (M  -  1)/A 

using  the  M  historical  data  points.  These  control  limits  perform  as  expected,  and 
should  be  used  in  lieu  of  Edgeman’s  original  ones.  Note  that  n  refers  to  the  size  of 
the  rational  subgroup. 

By  illustration,  if  one  assumes  an  /G(3,5)  distribution,  samples  of  size  5, 
a  =  .01,  and  uses  the  uncorrected  scheme  of  Equations  2.4  and  2.5,  one  obtains 
LCL  =  1.58538  and  UCL  =  27.8508.  In  control,  the  ARL  for  the  uncorrected 
scheme  is  24.000,  instead  of  the  desired  ARL  of  100.  Clearly,  something  is  amiss. 

If  we  use  our  corrected  scheme,  we  obtain  LCL  =  1.26308  and  U CL  —  7.12542, 
and  the  ARL  is  99.9891.  This  is  the  desired  performance. 

Edgeman  controlled  for  A  in  a  similar  fashion,  using  the  fact  that  the  distri¬ 
bution  of  y  ~  (l/A)yn-i-  obtained 

UCL  =  (2.12) 

(iV-1) 


4n(M  -  1) 

- - h  V  ^  tl-a/2,n-l 


(2.10) 


(2.11) 


UCL  = 


(2.12) 


Vxl- 


(N-l) 


(2.13) 

(2.14) 


We  will  see  in  our  example  in  Chapter  8  that  this  control  chart  scheme  be¬ 
haves  unexpectedly,  in  that  the  ARL  for  out-of-control  states  obtained  by  small  shifts 
towards  the  heavy  tail  is  actually  greater  than  the  ARL  when  in-control.  This  will 
support  our  argument  to  use  other  techniques  which  do  not  suffer  from  this  defect. 
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Edgeman  compared  his  uncorrected  charts  with  traditional  X  and  R  charts  for 
normal  processes,  and  found  that  the  normal  charts  did  not  perform  well.  As  noted 
above,  he  provided  no  evidence  that  his  uncorrected  design  performed  well,  either. 

Edgeman  also  noted  that  the  “UCLs  given  . . .  can  become  infinite.  In  the 
event  that  an  infinite  UCL  occurs,  it  may  be  desirable  to  construct  control  charts  for 
reciprocal  process  centrality.” 

2.2  Improvements 

Additional  improvements  to  Edgeman’s  work  will  focus  first  on  simplifying  the  work. 
Next,  we  extend  it  to  the  case  where  the  parameters  are  not  known  a  priori,  and 
we  design  a  “self-starting”  control  scheme  which  continually  adapts  to  the  data.  An 
alternative,  predictive  control  charts,  will  be  developed  in  the  next  chapter. 

2.3  Simplification 

The  transformation  of  the  sample  data  into  Y  and  W  statistics  was  done  to  obtain 
pivotal  quantities;  that  is,  quantities  whose  distribution  did  not  depend  on  the  param¬ 
eter.  While  this  is  useful  in  estimation,  it  is  less  so  in  significance  testing,  particularly 
where  we  assume  a  priori  that  we  know  the  distribution  of  the  process.  Addition¬ 
ally,  quality  control  methods  should  strive,  where  convenient,  for  simplicity  in  their 
implementation,  as  the  calculations  may  be  performed  manually  by  the  shop  worker. 

With  this  in  mind,  let  us  revisit  the  A  charts.  For  A  known,  we  know  that  the 
uniformly  most  powerful  (UMP)  unbiased  test  for  Hq  :  =  po  against  Hi  :  p  ^  po 

is  of  the  form.  X  >  ki  ox  X  <  k^-  In  control,  the  distribution  of  X  is  known  to 
be  IG{p,n  A).  Accordingly,  it  is  simpler  to  directly  determine  fci  and  k2  to  meet 
some  criteria.  Once  the  criteria  are  determined  (discussed  below)  it  is  much  easier 
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to  implement  the  control  charts.  Additionally,  the  problem  of  infinite  or  negative 
control  limits  is  also  avoided. 

ki  and  ^2  are  set  so  that  the  probabihty  that  the  process,  in  control,  exceeds 
them  is  O'  =  -At-  Traditionally,  they  have  been  set  symmetrically,  satisfying  P(X  > 

AJtiL 

iki)  =  a/2  and  P(X  <  k^)  =  c^/2. 

The  UMP  unbiased  test  sets  the  control  limits  so  that  f  {x\ fx,  X)dx  = 
IjARL  and  s  /(rr;^,  A)dx  =:  E{X).  Economic  arguments  based  on  non-uniform 

loss  might  mitigate  a  different  strategy. 

The  symmetric  probability  and  UMP-unbiased  control  limits  are  not  the  tight¬ 
est  possible  limits  for  a  skewed  distribution.  For  a  skewed  distribution,  the  highest 
probability  density  region  (HPD)  provides  the  shortest  interval.  See  Figure  2.1. 

The  HPD  is  of  the  form  R{a)  =  {x  :  f{x)  >  fc(a)}.  k{a)  is  found  numerically, 
where  a  is  the  desired  probability  of  an  out-of-control  signal  when  the  process  is 
in-control. 

A  routine  for  computing  the  HPD  control  limits  for  the  inverse  gaussian  is 
available  from  the  author. 

In  summary,  we  chart  A  to  control  for  central  tendency.  Since  the  distribu¬ 
tion  of  A  is  knov/n  to  be  JG(/r,nA),  we  set  the  control  limits  (R)  by  finding  either 
symmetric  limits,  UMP-unbiased  limits,  or  the  HPD  region,  R{a)  for  the  fj^{x).  The 
process  is  deemed  to  be  in  control  as  long  as  the  sample  average  is  within  R. 

This  approach  has  two  distinct  advantages  over  Edgeman’s  work.  First,  we 
avoid  the  possibility  of  infinite  upper  or  lower  control  limits.  Second,  with  the  HPD 
region  we  obtain  the  tightest  (in  the  sense  of  shortest  possible)  control  intervals  among 
all  intervals  with  the  same  a.  We  compare  the  performance  of  all  these  methods  later 


in  the  chapter. 
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Figure  2.1:  Illustration  of  the  HPD  region  for  a  skewed  distribution.  Here  a  =  .01, 
and  X  ~  /G(3, 25),  corresponding  to  the  distribution  of  the  mean  of  a  sample  of  size 
five  from  an  /G(3,25).  The  upper  and  lower  limits  of  the  HPD  region  were  found 
numerically  using  Mathematica  to  be  a:  =  1.04805  and  x  =  6.26794.  The  value  of 
f(x)  at  the  endpoints  of  the  HPD  is  /  =  .0119252. 
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2.3.1  Average  Run  Lengths 

We  illustrate  the  comparative  advantages  of  the  known  parameter  approaches  above. 
We  examine  the  ARLs  found  for  several  /G(^,  A)  distributions,  in-control  and  then 
with  departures  from  control  at  various  points  in  the  process  lifetime.  The  exact 
value  of  the  out-of-control  ARLs  can  be  found  by  integration  for  all  of  the  known 
paxameter  approaches. 

It  is  apparent  from  Tables  2.3.1  and  2.3.1  that,  among  the  four  alternatives, 
the  HPD  test  is  the  most  powerful  for  detecting  an  increase  in  the  mean,  but  performs 
poorly  for  detecting  decreases.  For  a  decrease  in  the  mean,  the  ARL  exceeds  100  for 
some  of  the  test  cases. 

The  symmetric  hmits  appear  to  perform  reasonably  well  for  detecting  both 
decreases  and  increases. 

The  corrected  Edgeman  test  is  much  slower  detecting  an  upward  shift,  com¬ 
pared  to  the  symmetric  and  HPD  tests.  However,  it  does  not  appear  to  suffer  loss  of 
performance  for  downward  shifts.  Of  course,  we  expect  good  performance  from  the 
corrected  Edgeman  test,  as  it  is  based  on  the  uniform  most  powerful  unbiased  test 
for  detecting  a  mean  shift,  given  A  is  unknown. 

We  note  that  in  many  applications  (and  both  of  the  examples  which  follow 
later  in  this  work),  one  is  more  interested  in  detecting  increases  in  jj,  than  in  detecting 
decreases.  This  favors  the  HPD  chart  over  the  corrected  Edgeman  chart. 


2.4  Self-starting  Shewhart  Charts 

In  this  section,  we  propose  charts  based  on  the  running  estimates  for  //  and  A.  Fol¬ 
lowing  Hawkins  [1987],  we  call  these  ^^self-starting  charts. 

In  the  preceding  section,  we  assumed  the  process  characteristics  were  known 
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Comparing  AlTLs  for  Mean  Tests 

A 

Edgeman 

Corrected  Edgeman 

TiPD 

Symmetric 

"3" 

“5“ 

— 2TDD 

99:98 

100“ 

IDO 

4 

5 

81.91 

20.79 

11.18 

15.18 

5 

5 

191.94 

6.56 

4.31 

5.30 

6 

5 

344.117 

3.62 

2.69 

3.11 

7 

5 

378.658 

2.58 

2.06 

2.30 

1 

5 

1.01 

1.12 

1.58 

1.17 

Table  2.1:  Comparison  of  the  Edgeman,  Symmetric,  and  HPD  schemes  for  a  mean 
shift,  A  known,  a  =  .01,  a  =  5,  with  the  in-control  distribution  as  IG{Z,  5).  Note  the 
poor  performance  of  the  uncorrected  Edgeman  scheme.  Also  note  that  the  HPD  is 
not  as  quick  to  detect  a  downward  mean  shift. 


Comparing  AltLs  tor  Mean  Tests 

A 

Edgeman’s 

Corrected  Edgeman 

HPD 

Symmetric 

-TT 

66 

23T7 - 

100 

100 

IDO 

45 

66 

31.00 

86.75 

53.8326 

69.87 

50 

66 

47.83 

46.86 

23.4519 

33.17 

60 

66 

97.08 

14.49 

8.1826 

10.81 

70 

66 

168.04 

6.96 

4.5026 

5.56 

40 

66 

18.82 

93.80 

154.278 

109.93 

35 

66 

10.61 

53.79 

267.764 

75.22 

Table  2.2;  Comparison  of  the  Edgeman,  Symmetric,  and  HPD  schemes  for  a  mean 
shift,  A  known,  a  =  .01,  n  =  5,  with  the  in-control  distribution  as  7G(42,  66).  This 
distribution  is  discussed  later  in  the  General  Motors  example.  Again,  note  the  poor 
performance  of  the  uncorrected  Edgeman  scheme  and  the  one-sided  performance  of 
the  HPD  test. 
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exactly.  Tins,  of  course,  is  not  alwa^/s  the  case,  Edgernan  addressed  this  issue  by 
setting  ^  =  X  and  A  -  l/V.  These  are  only  estimates,  and  subject  to  sampling 
error.  If  there  is  extensive  historical  data,  then  the  error  will  be  small.  However, 
Edgernan  states 

The  values  of  X  and  V  should  be  based  on  the  results  of  about  M  —  20 
to  M  =  25  samples 


citing  Montgomery  [1985].  A  check  of  the  revised  referenced  work  indicates  that,  in 
the  section  discussing  the  sta,ti3tical  basis  of  the  cha.rt  for  the  normal  distribution, 
Montgomery  [1991,  p.  203]  states 

These  estimates  should  usually  be  based  on  at  least  20  to  25  samples. 

(Emphasis  added.) 

The  use  of  the  rule  of  thumb  of  “about”  20  to  25  historical  samples  for  setting  con¬ 
trol  limits  for  a  non-normal,  potentially  heavy  tailed  distribution  such  as  the  inverse 
gaussian  does  not  seem  siipported  by  this  reference.  For  CUSUM  charts,  a  different 
context,  Hawkins  [1987]  has  found  evidence  for  normal  processes  that  “25  start-up 
observations  (as  seems  to  be  the  standard  practicel  is  too  short  a  learning  set,  par¬ 
ticularly  as  regards  the  process  standard  deviation.” 

Additionally,  the  use  of  a  fixed  learning  set  to  set  process  limits  ignores  the 
improved  precision  one  may  get  iTO.m  incorporating  subsequent  data  into  the  estimates 
of  the  process  parameteis.  Further,  such  a  rule  is  of  no  help  in  the  start-up  phase 
of  the  process,  when  we  wish  to  control  the  process  while  also  gathering  the  process 
initial  data. 

To  a,dd.ress  these  issues,  we  may  use  self-starting  control  charts.  Hawkins 
[1987]  has  proposed  sell-start, ing  (JUSUM  control  charts.  He  argues  for  them  as 


foliov/s: 
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Thus  the  self-starting  cusums  produce  superior  performance  to  those 
obtained  with  some  25  special  start-up  values  while  not  involving  any 
measurements  beyond  those  produced  anyway.  . . .  We  would  therefore 
argue  that  self-start  CUSUMs  should  always  be  used  in  preference  to 
plugging  the  mean  and  observations  of  a  start-up  sample  into  the  formulae 
assuming  known  parameters. 

We  argue  similarly  for  self-starting  Shewhart  charts,  while  noting  that  the  Shewhart- 
type  chart  is  more  robust  to  a  model  mis-specification  of  parameter  than  the  CUSUM. 
Hence  the  case  for  the  self-starting  CUSUM  may  be  more  compelling  than  that  for 
the  self-starting  Shewhart  chart. 

We  design  two  self-starting  Shewhart  charts,  one  for  location  and  one  for  shape. 
The  self-starting  Shewhart  chart  is  based  on  the  UMP-unbiased  test  [Chhikara,  1975] 
for  the  equality  of  two  inverse  gaussian  population  means.  The  self-starting  Shewhart 
chart  for  shape  is  based  on  the  likelihood  ratio  test  for  the  equality  of  the  scale 
parameters  A  and  r  of  two  inverse  gaussian  populations  A)  and  IG{v,t).  These 

are  discussed  in  detail  below. 

2.4.1  Self-Starting  Shewhart  charts  for  location  for  the  / G{^i,  A) 

Assume  we  have  two  inverse  gaussian  processes,  X  and  Y ,  with  common  but  un¬ 
known  scale  parameter  A.  Chhikara  [1975]  derived  the  uniformly  most  powerful  un¬ 
biased  (UMP-unbiased)  test  for  the  equality  of  the  two  process  means.  We  test  the 
hypothesis 

Hq:  11  =  V  versus  Hi  :  n  ^  u 

We  draw  a  sample  from  each  population.  The  rejection  region  is  given  by 
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\T\ 


\Jnin2{ni  +  ^2  —  2)  (X  —  Y) 


^  ^  1— a/2,niH-7i2~-2 


(2.15) 


\T\  has  the  folded  t  distribution  with  m  +  n2  -  2  degrees  of  freedom,  a  result 
reported  in  Chhikara  and  Folks  [1989].  Note  that  the  expression  in  Equation  6.31  of 
Chhikara  and  Folks  [1989]  is  incorrect;  the  expression  in  Equation  2.15  is  correct. 

To  construct  a  self- starting  scheme,  we  collect  a  first  sample  of  size  n,  call 
it  Xi.  We  then  collect  a  second  sample  of  size  n,  Yi,  compute  Ti  and  test  for  the 
equality  of  means.  If  the  null  hypothesis  is  not  rejected,  we  merge  sample  Xi  and  Yi 
into  the  new  reference  sample,  X2,  of  size  2n.  We  then  collect  our  third  sample  of 
size  n,  call  it  Y2,  compute  r2  and  test  for  the  equality  of  means  for  X2  and  Y2.  If  it 
is  not  rejected,  we  merge  Y2  with  X2,  and  draw  another  sample.  And  so  forth. 

The  scheme  suffers  from  the  disadvantage  that  the  control  limits  depend  on  a 
t  statistic,  whose  value  changes  as  the  sample  sizes  change.  This  problem  is  reduced 
as  the  number  of  samples  grows,  because  t  ^  N{0,1)- 

There  is  a  further  transformation  of  the  test  statistic  that  can  allow  for  con¬ 
stant  control  limits. 


For  a  given  observation  of  T,  we  compute  its  p-value  using  the  CDF  for  T: 
P  =  Ft{T).  It  is  well  known  that  P  ~  17(0,1).  We  then  convert  that  percentile 
to  a  iV(0, 1)  variate,  using  the  inverse  CDF  for  the  standard  normal  distribution: 
Z  =  $”^(P).  We  are  now  in  the  very  familiar  setting  of  charting  standard  normal 
variates, 

Z  =  4)-i(FT(r))~iV(0, 1) 


While  these  transformations  complicate  the  computations  for  the  operator 
doing  his  charts  by  hand,  for  those  using  a  computer  this  scheme  has  the  advantage 
of  constant  limits,  a  distribution  unaffected  by  the  number  of  samples  to  date,  and  a 
very  familiar  context.  This  allows  simpler  charting  and  easier  probability  calculations. 
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We  obtain  control  limits  in  the  usual  manner,  first  specifying  the  ARL  we 
wish.  We  then  set  a  =  and  find  Za/2  and  zi.a/2,  the  usual  critical  values.  Since 
in-control  we  are  charting  normal  variates,  all  standard  control  chart  theory  applies. 

If  the  mean  of  the  process  shifts,  we  see  that  the  numerator  of  T  no  longer  has 
expected  value  of  0.  Accordingly,  the  ARL  out-of-control  is  reduced.  Calculation 
of  the  exact  out-of-control  ARL  is  cumbersome  due  to  the  mixed  in-control  and 
out-of-control  distributions  in  the  denominator  of  T  and  is  further  complicated  by 
the  transformations.  For  approximating  those  out-of-control  ARL  calculations,  we 
suggest  simulation.  Simulation  also  provides  a  characterization  of  the  out-of-control 
run  length  distribution  itself. 

As  an  alternative,  one  could  forego  the  above  transformations  and  chart  the 
corresponding  p-values  for  the  Ti,  and  signal  if  the  p-values  fell  in  some  rejection 
region.  As  the  p- values  are  uniformly  distributed  while  the  process  is  in-control,  it  is 
easy  to  derive  a  control  scheme  for  them.  We  prefer  to  chart  the  corresponding  normal 
variates,  because  they  provide  higher  visual  resolution  in  the  tails  of  the  distribution. 

2.4.1. 1  Example 

We  provide  the  following  example  of  a  self-starting  chart  for  the  process  mean.  The 
in-control  distribution  is  /G(3,5).  We  take  samples  of  size  one  by  simulation.  We 
desire  an  ARL  of  100,  so  we  signal  if  l^j  >  2.5758  or  if  .005  <  p  <  .995. 

Our  first  case  has  an  out-of-control  point  deliberately  inserted  at  observation 
6.  Table  2.3  provides  the  data.  The  control  chart  is  presented  in  Figure  2.2.  We  see 
that  the  scheme  does  not  catch  an  early  outlier,  which  is  to  be  expected.  We  also  see 
that  the  next  several  observations  report  low  p-values  and  ^-scores,  which  is  also  to 
be  expected  until  the  effect  of  the  outlier  is  averaged  out. 

Our  second  case  has  the  outlier  at  observation  16.  Table  2.4  provides  the  data. 
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n 

A 

— X 

P-value 

Signal? 

1 

3 

25 

3.5156 

n/a 

2 

1.4671 

n/a 

3 

2.2176 

.4520 

4 

2.0622 

.3965 

5 

2.7420 

.6437 

6 

7 

40 

13.0000 

.9916 

NO 

7 

3 

25 

2.2903 

.2662 

8 

2.6414 

.3282 

9 

2.6564 

.3343 

10 

5.2536 

.6860 

11 

3.1756 

.4041 

12 

5.2532 

.6888 

13 

1.9516 

.1504 

14 

3.7829 

.5119 

15 

2.7475 

.3141 

16 

2.7620 

.3208 

17 

3.6853 

.5170 

18 

2.8630 

.3408 

19 

2.2799 

.2044 

20 

2.3533 

.2279 

Table  2.3:  Self-starting  results  for  a  data  stream  with  early  outlier,  a  =  .01. 

The  control  chart  is  presented  in  Figure  2.3.  We  see  that  the  scheme  does  catch  this 
outlier  when  it  occurs  later  in  the  process.  We  also  see  the  effect  on  the  observations 
following  the  outlier  is  less  dramatic. 

Self-starting  charts  offer  us  the  opportunity  to  detect  outliers  in  our  initial 
phase.  This  provides  control  not  available  otherwise.  Additionally,  once  we  correct 
the  conditions  which  caused  them,  we  can  delete  those  outliers  from  our  process 
historical  data,  obtaining  better  estimates  for  subsequent  control. 

We  return  to  this  data  set  when  we  analyze  the  performance  of  the  predictive 

charts  developed  in  the  next  chapter. 

2.4.2  Self-Starting  Control  charts  for  shape  for  the  A) 

Assume  we  have  two  inverse  gaussian  processes,  X  and  Y ,  with  common  but  un¬ 
known  mean,  fx.  Chhikara  [1975]  derived  the  uniform  most  powerful  unbiased  (UMP- 
unbiased)  test  for  the  equahty  of  the  two  process  shape  parameters,  A.  We  test  the 
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the  self-starting  example  with  later  outlier,  a  =  .01. 
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n 

A 

- X - 

P-value 

Signal? 

1 

3 

25 

3.5156 

n/a 

2 

1.4671 

n/a 

3 

2.2176 

.4520 

4 

2.0622 

.3965 

5 

2.7420 

.6437 

6 

2.7620 

.6385 

7 

2.2903 

.4179 

8 

2.6414 

.6013 

9 

2.6564 

.6033 

10 

5.2536 

.9882 

11 

3.1756 

.6451 

12 

5.2532 

.9472 

13 

1.9516 

.1427 

14 

3.7829 

.7382 

15 

2.7475 

.4165 

16 

7 

40 

13.0000 

.9993 

YES 

17 

3.6853 

.5170 

18 

2.8630 

.3408 

19 

2.2799 

.2044 

20 

2.3533 

.2279 

Table  2.4:  Self-starting  results  for  a  data  stream  with  later  outlier,  cl  —  .01. 
hypothesis: 

Ho‘^x  =  V  versus  Ha'- 

Again,  we  draw  a  sample  from  each  population.  Let 


and  Vy  be  similarly  defined.  Then  we  recall  from  Equation  1.17  that  the  distribution 
under  the  null  hypothesis  of 

(nr-l)Vx 
(nx  -  l)Vy 

is  known  to  be  Fnx-i,nY-i- 

We  can  use  this  known  distributional  result  to  construct  a  simple  self-starting 
Shewhart  scheme  to  control  for  the  shape  parameter. 

We  set  our  significance  level,  a,  a  priori.  We  then  draw  our  first  sample  from 
our  process,  and  call  it  Yi.  We  draw  our  second  sample,  and  call  it  X.  We  compute 
R  for  the  two  samples,  X  and  Yy  We  find  the  corresponding  two-sided  p-value  from 
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a  standard  F  table  or  numerical  routine  with  the  appropriate  degrees  of  freedom,  and 
chart  it.  If  p  ^  [q:/2,  1  -  a/2],  we  signal.  Else,  we  define  Y2  =  Yi  +  X ,  and  draw  a 
new  sample  ,  X . 

In  control,  our  ARL  is  1/a. 

Now,  assume  that  the  distribution  shifts  in  the  shape  parameter  to  Aq.  For 
the  first  observation  out-of-control,  we  have 


X  ~  Xa) 


Then  the  distribution  of  Vx  shifts  as  well: 


Fx 


Xnx— 1 


Then 


fi  =  =  A  X  ii„ 


A(nx  —  l)Fy  Ao 

where  Ra  has  the  F  distribution  with  the  appropriate  degrees  of  freedom.  In  other 
words,  there  is  a  scale  shift  in  R.  For  this  first  observation,  we  can  compute  explicitly 
the  probability  of  a  signal: 


^  X  Rq^  ^  lower)  ^upper\ J  ^  ^ 


lower 


upper 


A  ’  A 

For  subsequent  observations  out-of-control,  the  distribution  of  R  is  not  as 
clean,  due  to  the  mixing  of  differently  scaled  variables  in  the  expression  for  Y^. 


2.5  Conclusion 

In  this  chapter,  we  have  improved  and  extended  the  work  by  Edgeman,  who  first 
described  Shewhart  charts  for  the  inverse  gaussian  distribution.  We  have  corrected 
the  test  for  the  mean  of  the  process.  We  have  explored  both  symmetric  and  HPD 
control  limits,  and  contrasted  them  to  the  corrected  Edgeman  scheme.  We  have  also 
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described  self-starting  Shewhart  charts  to  allow  both  process  control  in  the  startup 
phase  and  improved  control  during  the  run  phase.  The  examples  in  the  preceding 
section  illustrated  the  comparative  utility  of  these  approaches. 


Chapter  3 

Predictive  control  charts  for  the  IG 


Instead  of  using  the  self-starting  methodology  of  the  previous  chapter,  it  is  possible 
to  consider  the  problem  anew  from  a  Bayesian  perspective.  This  offers  the  possibility 
of  incorporating  pre-existing  information  about  the  process  into  our  control  schemes, 
while  simultaneously  recognizing  that  there  is  uncertainty  associated  with  the  pre¬ 
existing  information. 

We  consider  two  classifications  of  schemes  in  this  chapter:  rational  groups  of 
size  one  and  then  larger  rational  groups.  We  derive  the  predictive  limits  for  the  next 
observation  or  next  group  of  observations. 

The  predictive  framework  has  the  advantage  of  dealing  with  observables.  While 
traditional  quality  control  methods  control  for  the  value  of  a  parameter,  the  predic¬ 
tive  approach  controls  more  generally  for  model  departures.  If  the  model  is  correct, 
and  unchanged,  we  may  obtain  an  interval  with  a  given  probability  for  the  next  ob¬ 
servation.  If  the  next  observation  is  within  that  interval,  we  update  and  continue  to 
sample.  If  the  next  observation  is  outside  that  interval,  we  stop  the  process  and  check 
for  model  departures  in  the  underlying  physical  process. 

Since  we  also  obtain  the  distribution  of  the  next  observable,  we  are  able  to 
make  expected  loss  calculations  for  any  loss  function.  We  can  then  control  for  ex¬ 
pected  loss  as  a  function  of  the  next  observable,  instead  of  controlling  for  the  value  of 
the  observable,  or  even  for  a  parameter  shift.  The  abihty  to  introduce  loss  functions 
in  this  manner  also  argues  strongly  for  the  predictive  approach. 
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3.1  Sample  of  size  one 

In  this  section,  we  utilize  Equations  1.18  and  1.19  to  construct  a  control  scheme  for 
the  next  observation  from  an  inverse  gaussian  process.  In  this  approach,  we  assumed 
an  appropriate  non-informative  prior  distribution,  as  was  used  in  Chapter  1  where 
these  results  were  reviewed. 

Instead  of  a  non-informative  prior  distribution,  it  is  possible  to  use  the  natural 
conjugate  prior  for  the  inverse  gaussian  as  the  prior.  Those  methods  are  not  developed 
further  in  this  thesis.  Such  an  approach  would  have  the  advantage  of  greater  predictive 
power  early  in  the  process  startup  phase,  reflecting  a  more  informed  prior  opinion. 

Recall  that  the  predictive  distribution  for  the  next  observation  Y  from  an 


inverse  gaussian  process  with  previous  observations  Xi,  X2,  •  ■  •  >  Xn  is  given  by: 


h(ylx)  ==  k 


(nx  +  y)y^ 


(x  -  y)^X 
xy{nx  +  y) 


where 


St,n  denotes  the  Student’s  t  distribution  with  n  degrees  of  freedom,  x  is  the  arithmetic 
mean,  and  v  =  Z)(l/a:i)  —  1/x,  and 


2 


nv  + 


n{x  -  yY 
xy{nx  +  y) 


As  in  the  previous  chapter,  there  are  two  approaches  to  flnding  limits  for 


the  next  observation.  The  first  uses  symmetric  quantiles,  and  requires  us  to  compute 


quantiles  for  predictive  distribution.  A  numerical  routine  which  computes  these  quan¬ 
tiles  is  available  from  the  author.  The  second  approach  uses  the  highest  probability 


density  region.  It,  too,  has  a  numerical  routine  available  from  the  author. 
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Prom  each  of  these  schemes  (controlling  with  probability  limits  and  controlling 
with  highest  probability  density  limits),  we  have  obtained  a  region  R  for  the  next 
observation  Y.lfY  e  R,  we  continue  to  sample.  HY  ^  R,  then  we  stop  the  process 
and  check  for  a  physical  reason  for  the  model  departure. 

For  the  first  approach,  notice  that  we  do  not  actually  have  to  compute  the 
region,  R,  which  is  an  extensive  exercise  in  numerical  computing.  Rather,  we  compute 
the  p  value  associated  with  the  observation,  and  compute  and  chart  it.  If  P(Y  •C  y) 
is  too  low  or  too  high,  we  stop  the  process  as  out-of-control. 

For  the  second  approach,  we  find  the  P{h{Y)  <  h{y)).  Again  we  chart  this 
probability.  If  this  probability  is  too  low  (one  sided  test  here),  we  stop  the  process  as 
out-of-control.  Computing  P{h{Y)  <  h{y))  is  easier  than  finding  the  corresponding 
rejection  region. 

Charting  p  values  has  the  additional  benefit  of  being  easy  to  explain  to  the 
practitioner.  We  are  charting  how  unlikely  the  subsequent  data  is,  given  all  of  the 
preceding  observations. 

3.1.1  An  example 

Let  us  assume  that  we  have  a  process  with  true  but  unknown  parameters  =  3,  and 
A  =  25.  Let  us  further  assume  that  we  have  an  out-of-control  state  which  has  =  7 
and  A  =  40.  We  will  run  the  process  in  control  for  various  lengths,  and  then  see  if 
the  chart  detects  the  departures.  We  will  set  a  =  .01,  resulting  in  an  ARL  of  100  in 
control.  Notice  the  scheme  does  not  know  what  the  initial  parameters  are.  Tables 
3.1.1  and  3.2  record  what  happens. 

In  constructing  the  samples,  we  generated  19  observations  from  an  IG{3,  25), 
and  one  outlier.  The  outlier  was  chosen  so  as  not  to  signal  early  in  the  process,  but 
to  signal  late  in  the  process.  We  see  that  the  vagueness  in  the  prior  results  in  a 
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n 

IJ' 

A 

— X — 

P-value 

SignalY 

1 

3 

25 

3.5156 

n/a 

2 

1.4671 

n/a 

3 

2.2176 

.4704 

4 

2.0622 

.4718 

5 

2.7420 

.7141 

6 

7 

40 

13.0000 

.9947 

NO 

7 

3 

25 

2.2903 

.3796 

8 

2.6414 

.4529 

9 

2.6564 

.4554 

10 

5.2536 

.7978 

11 

3.1756 

.5253 

12 

5.2532 

.7938 

13 

1.9516 

.2174 

14 

3.7829 

.6277 

15 

2.7475 

.4131 

16 

2.7620 

.4175 

17 

3.6853 

.6219 

18 

2.8630 

.4344 

19 

2.2799 

.2741 

20 

2.3533 

.3010 

Table  3.1:  Representative  predictive  results  for  early  outlier.  We  signal  if  p  ^ 
(.005,  .995). 


n 

A 

— X 

P-value 

Si^^nalY 

1 

3 

25 

3.5156 

n/a 

2 

1.4671 

n/a 

3 

2.2176 

.4704 

4 

2.0622 

.4718 

5 

2.7420 

.7141 

6 

2.7620 

.7048 

7 

2.2903 

.4816 

8 

2.6414 

.6578 

9 

2.6564 

.6557 

10 

5.2536 

.9926 

11 

3.1756 

.7099 

12 

5.2532 

.9657 

13 

1.9516 

.1851 

14 

3.7829 

.7988 

15 

2.7475 

.4881 

16 

7 

40 

13.0000 

.99973 

YES 

17 

3.6853 

.6219 

18 

2.8630 

.4344 

19 

2.2799 

.2741 

20 

2.3533 

.3010 

Table  3.2:  Representative  predictive  results  for  later  outlier.  We  signal  if  p  ^ 
(.005,  .995). 
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wide  posterior  distribution  for  the  next  observable  early  in  the  process.  That  is  why 
observation  6  in  Table  3.1.1  does  not  signal,  but  observation  16  in  Table  3.2  does. 

We  can  introduce  a  loss  function  into  this  problem.  Assume  that  there  is  a 
loss  function  L{Y)  associated  with  each  observation.  We  wish  to  predict  the  expected 
loss  for  the  next  observable,  based  on  the  current  data.  If  the  predicted  expected  loss 
exceeds  some  value,  we  stop  the  process.  Then 

fOO 

Ey\x{L{Y)\X)  =  /  L(y)h{y\x)  dy  (3-3) 

J  0 

A  simple  example  of  a  loss  function  would  arise  in  a  warranty  context.  Say 
Y  models  the  lifetime  of  the  next  component,  which  is  warranted  for  a  period  of  one 
time  unit.  If  y  <  1,  we  have  a  loss  of,  say,  1,  else  we  have  no  loss.  The  expected 
loss  is,  of  course,  the  same  as  finding  P{Y  <  llX),  where  Y  is  the  predicted  next 
observation. 

One  can  use  any  other  loss  function  in  a  similar  manner.  We  discuss  two 
asymmetric  loss  functions  which  seem  useful  later  in  this  chapter. 

This  Bayesian  approach  is  particularly  useful  when  one  is  indiflFerent  to  the 
model  parameters,  but  very  interested  in  controlling  some  loss.  For  example,  when 
modeling  skewed  data,  the  practitioner  might  not  care  if  the  model  used  was  a  log¬ 
normal  one  or  an  inverse  gaussian  one.  He  might  only  care  what  his  expected  loss 
was.  The  advantage  of  the  inverse  gaussian  model  is  that  it  allows  the  calculation  of 
this  expected  loss  much  more  easily  than  the  lognormal  model,  using  this  predictive 
approach  based  on  the  sample. 

3.1.2  Comparison  with  the  self-starting  scheme 

Recall  that  we  have  used  the  same  data  sets  as  examples  for  both  the  self-starting 
and  predictive  charts.  Comparing  Tables  2.3  and  2.4  with  Tables  3.1.1  and  3.2, 
we  see  that  the  predictive  method  almost  detected  the  early  outlier  (p  -  .9947)  and 
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was  more  sensitive  than  the  self-starting  technique  (where  p  =  .9916.)  The  predictive 
chart  also  signaled  more  strongly  than  the  self-starting  chart  for  the  later  outlier 
(p  =  .9997  for  the  predictive  chart  vs.  p  =  .9993  for  the  self-starting  scheme). 


3.2  Sample  of  size  m 


A  search  of  the  literature  does  not  reveal  the  predictive  distribution  for  the  next  m 
observations,  based  on  a  sample  of  size  n.  In  this  section,  we  derive  this  distribution, 
using  principles  summarized  by  Geisser  [1993].  We  parallel  the  approach  in  Chhikara 
and  Guttman,  [1982],  used  for  the  univariate  case. 

The  resulting  distribution  can  be  used  in  exactly  the  same  manner  as  the 
preceding  section  for  control  of  the  process  or  the  loss  from  a  process. 

For  ease  of  computation,  we  use  the  parameterization  of  the  density  in  the  form 
f{x\9^  A).  We  use  the  diflhise  prior  for  9  and  A  given  by  [Banerjee  and  Bhattacharyya, 
1979],  which  is 

p(0.  A)  oc  i 


Then,  the  likelihood  becomes 


L{9,  A|x)  oc  exp 


(■ 


n\v 


1  + 


{x9  -  1)=^ 


vx 


where  x  —  Y^Xi/n  and 

E(i/^i)  1 

n  X 

Note  that  we  have  slightly  redefined  v  in  this  section,  to  match  the  notation  of 
Chhikara  and  Folks. 

The  posterior  density  for  the  parameters  becomes 


/(^,A|x) 


(— 

^1/2 

f  nu  ] 

U  J 

^(n/2-1) 

r( 

n-l'\ 
2  ) 

St,n 

exp 


VX  )  j 


(3.4) 
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We  write  the  joint  density  of  the  future  m  observations,  y,  given  0,  A  as; 


/(y|^,A)  = 


--  X; 


1  l\  ^  m{9y  —  1)^ 

Vi  y)  y 


Then  the  predictive  density  becomes: 


oo  oo 

h(ylx)  =  j  J  fiy\0,  A)p(^,A|x)  d9dX 


The  simplification  of  h(ylx)  is  an  extensive  bit  of  work.  We  will  use  the 
following  constants  and  redefined  variables  to  reduce  the  notation.  We  will  introduce 


additional  notation  later. 


r  (V) 

mx  +  nyjvx  +  1)  +  w(x  y)  {m  +  n)^ 

^  ~  xy  my  +  nx 

Using  the  above  change  of  variables  and  constants,  we  can  write  Equation  3.6 


h(y\J^)  =  c  j  J  ^exp  -A 


2  +  my  +  nx  (9 


m+n 
my’bnx  ^ 


d9  dX  (3.10) 


Using  the  appropriate  Gamma  identity,  we  simplify  to: 


h(y|x)  -  cT 


m  +n' 


z  +  my  +  nx  (9 


m+n 

my-Hix 


2\  -(m+n)/2 


d9  (3.11) 


We  make  four  new  definitions: 


m  +  n 
my  +  nx 


Cl  =  c  r 


_ t^/z _ 

TTi  Tn  +  ly/nx  +  my 
P  / 2(r^-H»)/2 
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C2 


LL 


Cl 


^/m  +  n-  l^/nx  +  my 


-(m+n)  /(m  +  n  -  l){nx  +  my) 


my  +  nx  V 

Now,  we  change  variables  from  6  to  t.  The  integral  for  the  predictive  density 
becomes: 


OO 


/i(ylx)  =  Cl  /  2  + 


zP 


LL 


m  +  n  — 


t) 


-(m+n)/2 


dt 


(3.12) 


Factoring  out  the  z,  grouping,  and  appealing  to  the  symmetry  of  the  integrand, 
we  obtain: 

.2  \  -(?"+”) /2 


Hy\'^)  =  C2  /  1  + 


p 


LL 


m  +  n  —  1 


) 


dt 


LL 


-(m+n)/2 


-  C2 


1  + 


m  +  n  —  1^ 
=  C2  Stffi^n—li.  LL') 


dt 


-OO 


(3.13) 

(3.14) 

(3.15) 


Here,  as  before,  the  CDF  of  the  Student’s  t-distribution  with  k  degrees  of 
freedom  is  given  by  St,k. 

Now  that  the  parameters  have  been  successfully  integrated  out,  the  remaining 
task  is  to  recover  the  variables  of  interest  from  the  expression  and  simplify.  We  have: 


h{y\^) 


C2‘S't,mH-n— 1(  LL) 

^-(n,+n-i)/2r((^  +  n)/2)2(^+’^)/^ 

^^J{m  +  n  —  l)(ns  +  my)r((n  —  l)/2) 
St,n-i{s/{n-l)/2)  V2; 


o  (  (m-{-n)y/m-\-n-^ 


^  (nS  +  my)v  [1  (yf) )  ^  5t,„_i 


(3.16) 
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X  (nv) 


-n/2  \  2  /  ^-(m+n-l)/2 


r(Bfl)  7rW2' 


(3.17) 


jx{nv)^ 

n 

( 1  A 
1  2  ) 

\  V  r  1 

1  2  j 

1  7rW2  St,n-1 

o  \  (rrt+n)v^m+n-l  ^ 

^^(my+na)  ) 

m+n-l  3/2 
^  2  Wy.' 


(3.18) 


We  write 

n(vx  +  l)  ,  Tn  {m+nf 

z  - - h  w  +  - —  r 

X  y  my  +  nx 

Notice  that  Equation  3.18  reduces  to  three  summary  statistics  for  the  future 
observables;  w,y,  and  U.yi,  which  are  functions  of  the  harmonic,  arithmetic,  and 
geometric  means,  respectively.  Xhis  contrasts  with  inference  for  the  parameters, 
which  reduced  to  the  two  sufficient  statistics  (  which  were  the  arithmetic  and  harmonic 
means).  We  can  apply  Equation  3.18  to  our  control  charting  scenario  for  groups  of 
observations.  Say  we  are  sampling  from  a  process  which  is  in  its  start-up  phase.  We 
take  samples  of  size  m.  After  the  first  sample,  we  can  compute  a  predictive  density  for 
the  second  sample.  We  find  the  probability  of  obtaining  the  current  sample  given  the 
predictive  distribution  for  the  previous  sample(s).  If  that  probability  is  alarmingly 
low,  we  stop  the  process  and  investigate.  Else,  we  incorporate  the  current  sample 
into  our  “old”  data  and  recompute  a  new  predictive  distribution,  and  sample  again. 

Determining  the  acceptance  region  of  the  future  observables  is  best  accom¬ 
plished  by  use  of  the  highest  probability  density  approach,  which  results  in  a  one¬ 
dimensional  measure.  We  compute  the  value  of  the  predicted  density  for  the  observa¬ 
tions,  and  then  compute  P  (h(Ylx)  >  /i(ylx)|x)  using  either  Monte  Carlo  methods, 
or  an  indicator  function  and  a  numerical  integration  routine  over  m  space. 

In  practice,  this  approach  is  more  cumbersome  than  merely  using  the  predictive 
format  for  a  single  future  observable.  This  is  especially  true  when  attempting  to 
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compute  the  predictive  probability  of  the  next  set  of  observables.  It  does  allow  for 
fair  treatment  of  all  of  the  observations  in  the  next  sample,  avoiding  the  temptation  to 
order  favorably  or  unfavorably  the  observations  when  applying  the  single  observation 
prediction  tests. 

The  multiple  predictive  density  is  of  greater  utility  in  computing  expected 
losses  for  the  next  batch  of  the  process,  as  opposed  to  the  next  sample.  One  can 
adjust  the  dimension  of  the  future  observables  to  any  convenient  size,  and  work  with 
it.  It  might  be  that  the  appropriate  size  for  prediction  was,  say,  the  output  of  the  next 
shift  at  a  plant,  instead  of  predicting  for  merely  the  next  sample  or  the  quantities 
until  the  next  sample. 

3.3  Loss  functions 

There  are  two  current  loss  functions  widely  in  use  in  quality  control. 

The  first  is  an  indicator  function  for  the  region  where  the  process  is  out-of- 
specification,  L{y)  =  Iniy)-  It  is  equal  to  one  when  the  process  is  out-of-specification, 
and  zero  when  the  process  is  in-control.  This  is  the  implied  loss  function  for  many 
traditional  charting  schemes. 

The  second  function  in  use  is  quadratic  loss,  whose  best  known  proponent 
is  Taguchi.  Quadratic  loss  argues  for  continuous  improvement  to  reduce  variabil¬ 
ity,  as  any  deviation  from  the  target  value  for  the  process  characteristic  is  penal¬ 
ized.  Quadratic  loss  is  symmetric  with  respect  to  deviations  above  and  below  target. 
Quadratic  loss  is  also  very  tractable  to  work  with,  and  is  directly  related  to  the 
variance  of  a  process. 

However,  there  is  nothing  sacred  about  quadratic  loss  as  a  choice  of  loss  func¬ 
tion,  and  upon  reflection  one  can  see  that  it  does  have  modeling  weaknesses. 

It  is  a  large  assumption  that  one’s  losses  are  quadratic  in  form.  Most  are 
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asymmetric.  For  example,  consider  filling  a  bag  with  fiour.  The  loss  for  slightly  over¬ 
filling  the  bag  is  likely  much  less  than  the  loss  for  under-filling,  with  its  associated 
losses  of  good-will  and  regulatory  penalties.  Similarly,  consider  the  tensile  strength 
of  550  pound  parachute  cord.  The  loss  for  making  the  cord  too  strong  is  probably 
much  smaller  than  the  loss  for  making  the  cord  too  weak,  and  the  loss  for  making 
the  cord  too  weak  grows  (in  this  parachutist’s  opinion)  much  faster  than  quadratic 

growth. 

Loss  functions  ideally  would  come  from  process  understanding.  Many  of  the 
losses  for  being  out-of-control  are  intangible,  or  poorly  estimable.  That  lack  of  un¬ 
derstanding  of  the  loss  has  argued  for  a  quadratic  loss  function,  which  at  least  is  easy 
to  work  with.  However,  it  is  also  possible  to  work  with  at  least  two  other  types  of 
asymmetric  loss  functions. 


3.3.1  Lin-quad  loss 


This  loss  function  is  a  piecewise  differentiable  function  which  is  linear  on  one  side 
of  the  target  value  and  quadratic  on  the  other.  It  is  easy  to  derive  its  form.  Let 
the  quadratic  loss  be  specified  by  Li{y)  =  k{x  —  c)^,  where  c  is  the  target  value  and 
k  is  the  coefficient.  Let  the  linear  loss  be  specified  by  L2iy)  =  K®  “  ^  adjusts 

the  intercepts,  as  we  will  see  below.  To  obtain  differentiablity,  we  define  the  point 
X  =  b/{2k)  +  c  as  the  point  where  the  definition  changes.  We  have  then: 


_  I  ^  -  c)  for  s  <  c  -h  ^ 


2k 

for  ar  >  c  +  ^ 


,  /  -  x2  f _ .  "6  (3.19) 

k[x  —  c) 

Here  we  have  placed  the  linear  portion  on  the  left  side  (6  <  0).  A  similar  procedure 
works  for  the  linear  loss  on  the  right  hand  side.  This  loss  function  is  pictured  in 


Figure  3.1. 

For  the  multivariate  predictive  case,  one  way  we  can  obtain  our  multivariate 
loss  is  by  adding  the  marginal  expected  losses. 
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Figure  3.1:  Linear-quadratic  loss  function  for  target  value  0. 
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The  coefficients  c,  k  and  b  can  be  estimated  in  the  usual  way  from  historical 
loss  data  or  assumed  as  part  of  the  modeling  effort. 

3.3.2  LINEX  Loss 

Varian  [1975]  proposed,  Zellner  [1986]  further  investigated,  and  Geisser  [1993]  used 
an  asymmetric  loss  function  called  LINEX.  Varian  defined  A  =  0  —  ^  as  the  scalar 
estimation  error.  His  loss  function  was  then 

L(A)  =  6exp(aA)  —  cA  —  b 

This  loss  function  is  illustrated  in  Figure  3.2. 

This  loss  function  is  recommended  where  the  consequences  for  error  to  one 
side  of  the  target  value  are  catastrophic.  Geisser  [1993]  discusses  this  loss  function 
in  the  context  of  regulating  insulin  dosage  to  diabetics,  where  too  much  insulin  is 
catastrophic,  compared  to  too  little. 

We  can  use  a  loss  function  of  similar  form  for  our  predictive  control  problem, 
where  A  =  y  —  c,  and  c  is  our  target  value.  Again,  we  can  find  the  expected  loss  by 
integrating  against  the  predictive  density. 

3.4  Conclusions 

The  advantages  of  using  a  predictive  method  should  now  be  clear:  first,  we  obtain 
tighter  predictions;  second,  we  can  integrate  our  predictive  density  against  a  loss 
function  and  obtain  our  expected  loss.  We  can  base  our  decisions  on  this  value.  By 
further  incorporating  in  the  cost  of  stopping  the  process  to  investigate  a  signal,  we 
can  act  on  our  total  expected  loss.  This  offers  a  choice:  instead  of  acting  on  possible 
parameter  shifts  which  may  cost  more  to  investigate  than  to  tolerate,  we  can  decide 
on  economic  grounds. 


Linex  lo5s(x) 
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Figure  3.2:  Linear-exponential  loss  function  for  target  value  0. 


Chapter  4 

Cumulative  Sum  Charts  for  IG 
Processes 


In  this  chapter,  we  derive  a  CUSUM  scheme  for  processes  modeled  by  the  IG  distri¬ 
bution.  We  chart  for  location  and  shape,  and  then  examine  the  behavior  of  the  tests 
under  the  model,  and  under  various  model  departures. 


4.1  Scheme  construction 

Control  charts  for  normal  processes  chart  location  and  scale.  The  statistic  for  location 
is  usually  the  sample  mean;  the  one  for  scale  is  usually  the  sample  standard  deviation. 
For  the  IG  distribution,  those  statistics  (sample  mean,  sample  standard  deviation)  are 
not  independent.  To  use  them  to  control  an  IG  process,  one  would  need  a  bivariate 
chart  (discussed  below).  However,  the  control  limits  for  such  a  chart  would  seem 
artificial,  as  indicated  by  the  plot  of  X  vs.  S  for  an  /G  example  in  Fig  4.1,  below.  A 
better  scheme  for  joint  control  will  be  discussed  later. 

We  return  to  well-known  principles  to  derive  our  CUSUM  scheme.  Recall 
from  Chapter  1  that  the  inverse  gaussian  distribution  is  a  member  of  the  exponential 
family.  As  discussed  in  Hawkins  [1992c]  and  in  Chapter  1,  the  upward  CUSUM  for  a 
member  of  exponential  family  in  decision  interval  form  is  defined  by 

=  0  (4-1) 

S+  =  max{0,S+_,  +  Tn-k+)  (4-2) 
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d{0i)  -  djeo) 
b{e^)-b{9o)' 


(4.3) 


where  b{9)  and  d{6)  come  from  the  parameterization  of  the  exponential  family 
as  f{x)  =  exp{a{x)b{9)  +  c{x)  +  d{9))  and  Tn  =  a{Xn)  ■ 

Similarly,  the  downward  CUSUM  is  given  by: 


5o-  =  0  (4.4) 

S~  =  min(0,  S~_-^  +  Tn  +  k~)  (4.5) 

d{9i)  -  d{9o) 
b{9i)  —  b{9Q) 

We  assume  as  our  model  that  our  process  is  well  specified  as  X  ~  7G(/i,  A). 
For  two  parameters,  we  write  the  density  of  X  as 

<t>2)  =  {x,  1/x)  •  (<^i,  4>2)) 

where  <pi  =  ^  and  <^2  =  A. 

We  will  use  four  charts  simultaneously:  one  each  for  upward  and  downward 

departure  in  the  two  model  parameters. 

The  schemes  derived  below  are  for  individual  observations,  or  samples  of  size 
one.  This  has  the  advantage  of  maximum  flexibility:  one  can  always  chart  a  larger 
sample  as  a  group  of  individual  observations.  While  we  do  not  do  so  here,  it  is  possible 
also  to  derive  schemes  for  samples  larger  than  size  one  by  considering  CUSUMs  of 
the  X  and  F,  the  minimal  sufficient  statistics  for  the  sample. 


4.2  CUSUMs  for  location 

We  use  the  decision  interval  form  for  the  CUSUM,  as  it  is  more  easily  implemented  by 
computer.  We  maintain  a  pair  of  CUSUMs  for  location;  one  for  upward  departures 
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from  the  mean  and  the  second  for  downward  departures,  denoted  by  5'*'  and  S  , 
respectively. 

For  our  CUSUM  for  location,  we  assume  that  (1)2  =  X  is  fixed  and  known.  We 
then  have  a  one-parameter  exponential  family  in  <j)i.  To  determine  our  settings  for 
the  CUSUM  chart,  we  first  decide  for  which  shifts  of  the  mean  we  wish  maximum 
sensitivity.  We  call  the  upper  value  and  the  lower  value  fi  .  The  corresponding 
values  of  (f>i  are  called  (pf  and  . 

With  the  shape  parameter  fixed,  we  have 


/(re,  (pi)  =  exp(o(re)6((?ii)  +  c{x)  -h  d{<pi)) 


with  o(re)  =  re,  d(^i)  =  y/ and  Then  the  reference  value,  ki,  is 

given  by  Equation  4.3,  which  simplifies  to 


-2y/X 

‘“"VS+v^ 


(4,6) 


where  (p  =  XfiJ?  and  (pi  is  (pf  or  4>i ,  as  appropriate. 

Rewriting  Equation  4.6  in  terms  of  the  parameters  ix  and  A,  we  obtain 


k  {m)  = 


fX  +  fXi 


(4.7) 


This  value  for  k  is  the  harmonic  mean  of  the  in-control  and  out-of-control 
parameters.  Notice  that  this  value  for  k  is  very  close  to  the  heuristic  value  one  would 
obtain  by  using  the  arithmetic  mean.  The  difference  between  the  heuristic  value  and 
the  optimal  value  for  k  is  only 

A  =  (4.8) 

^ino  +  fii) 

The  CUSUM  scheme  for  location  then  consists  of  two  decision  interval  charts, 
one  for  upward  detection  and  one  for  downward  detection.  They  are  given  by  Equa¬ 
tions  4.1,  4.2,  and  4.7  for  the  upward  CUSUM  and  Equations  4.4,  4.2,  and  4.7  for 
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the  downward  CUSUM.  In  both  cases,  T„(Xn)  =  This  scheme  has  the  advantage 

of  simplicity:  it  accumulates  the  sum  of  the  observations  themselves  less  k,  without 

the  need  for  any  transformation.  The  k  term  depends  only  on  the  mean  under  the 

null  and  alternative  hypothesis.  The  parameter  4>i  is  not  as  simple  as  the  mean  or 

variance  of  the  process,  but  it  does  have  the  characterization  as  the  process  mean 

divided  by  the  process  variance: 

^  A  fi  EX 
^  ^  Z  ^  Var{X) 

also  known  as  the  process  signal-to-noise  ratio. 

The  scheme  signals  when  either  5^  >  h'^  or  <  h  . 

h'^  and  h~  are  set  by  considering  the  average  run  length,  that  is,  the  average 
number  of  samples  until  5+  >  hA  or  S~  <  resulting  in  a  false  signal  when  the 
process  remains  in  control.  These  average  run  lengths  are  discussed  below. 

4.3  CUSUMs  for  shape 

To  derive  our  CUSUM  for  the  shape  parameter,  we  assume  fixed  and  known,  and 
again  consider  the  resulting  one  parameter  exponential  family.  AVe  take 

=  -1^ 

b(X)  =  -A/2,  and  d{X)  =  ln(A)/2.  The  sequential  probability  ratio  test  for  a  shift 
from  A  to  Aq  involves  summing  a{xi)  until  it  exceeds  a  limit  which  depends  on  the 
number  of  terms  in  the  sum. 

We  could  attempt  to  follow  the  same  procedure  as  in  the  previous  section  to 
construct  our  CUSUM  for  shape.  This  would  result  in  the  scheme  5^  =  0,  5+  = 
max(0,  +  a{Xn)  —  k),  with 

_  d(Ao)  -  d{\)  _  ln(Ao/A) 
b(Xo)  —  6(A)  Aq  ~  A 


(4.9) 
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We  prefer  to  focus  instead  on  the  fact  that  Aa(X)  Xi  derive  a  CUSUM 

scheme  for  that  distribution.  Then  a  test  for  a  shift  in  A  becomes  a  test  of  a  scale 
change  in  a  Xv  better,  a  scale  change  in  a  r(l/2, 2)  distribution,  with  the  gamma 
density  given  in  the  form 


P  °‘x°‘  ^exp(— a;/yd) 

r(a) 


(4.10) 


By  focusing  on  the  more  general  problem  of  a  scale  shift  for  a  Gamma  distri¬ 
bution,  we  get  two  CUSUM  schemes  for  the  price  of  one. 

Note  that  the  Ph.D.  dissertation  of  Regula  [Regula,  1976]  considered  a  shift  in 
the  shape  parameter,  a,  for  a  Gamma  distribution.  A  search  of  the  literature  does  not 
reveal  an  explicit  CUSUM  test  for  the  scale  parameter,  of  a  Gamma  distribution, 
although  it  follows  easily  from  the  CUSUM  for  the  variance  of  a  normal  distribution 
derived  by  Johnson  and  Leone  [1961b],  and  the  work  of  Hawkins  [1992c]. 

Let  Y  ~  r(a,  /3).  The  log-likelihood  ratio  for  the  Gamma  distribution  for  a 
test  of  ^  =  Pa  against  p  =  Po  using  Equation  4.10  simplifies  to 


An 


{Po  Pa)yi 
PoPa 


(4.11) 


Following  our  pattern  of  using  the  SPRT  to  obtain  CUSUM  schemes.  Equa¬ 
tion  4.11  motivates  the  following  CUSUM  scheme  for  the  shape  parameter  of  the 


gamma  distribution: 


So  =  0 

Sn  =  max(0,  Sn-l  +  Yn-  k) 

,  _  CxPoPaI<^iPo/Pa)  ^4  12) 

Pa-Po 

As  usual,  the  scheme  signals  when  S^  >  h,  where  h  is  found  by  determining 
the  a  priori  desired  ARL. 
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A  FORTRAN  routine  for  determining  h  for  the  scale  change  for  a  r(a,  /5)  is 
available  from  the  author. 

For  the  problem  of  the  shape  parameter  given  X  ~  A),  we  note  that  the 

distribution  of 

=  HX)  ~  xi 

We  see  that  a  shift  in  A  results  in  a  scale  shift  in  a  r(l/2,2)  and  can  be  detected 
using  the  scheme  just  derived  for  the  scale  shift  of  a  Gamma  distribution. 

Our  in-control  parameter  for  /3  is  2.  When  the  true  value  of  A  shifts  to  Aa,  the 
distribution  of  Xoa^X)  shifts  to 

So  our  alternate  point  hypothesis  for  the  CUSUM  of  the  Gamma  scale  parameter 
becomes  P  = 

Our  CUSUM  scheme  then  becomes 


5o+  =  0 

=  max(5^_i  -f-  Aoa(X„)  —  k)  (4-13) 

In  ^ 

k  =  Aot — ^  (4-14) 

Ao  Ao 

This  scheme  is  illustrated  in  Table  4.1  and  its  behavior  is  explored  in  detail  in 


the  next  section. 

We  note  that  it  is  the  same  scheme  as  the  one  we  derived  from  first  principles 
in  Equation  4.9  but  has  more  general  application  to  the  problem  of  scale  shift  for  an 
arbitrary  Gamma  distribution. 
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Ml 

A 

h 

ARL  in  control 

ARL  out-of- control 

“3:3“ 

1 

47742 

3.639 

5 

16.340 

9.730 

10 

44.877 

20.314 

20 

178.354 

47.989 

40 

1233.208 

115.569 

^0 

Ai 

h 

ARL  in  control 

ARL  out-ot-controT 

'3 

4 

“sr 

31.691 

17.44b 

10 

105.302 

40.639 

20 

532.991 

101.779 

40 

5163.017 

239.129 

Table  4.1:  Some  in-control  and  out-of  control  ARL  values  for  various  CUSUM  pa¬ 
rameters.  Out-of-control  values  are  taken  for  the  parameter  at  the  alternate  (tuning) 
value. 

4.4  ARLs  in  control 

We  provide  two  methods  for  computing  the  ARLs  of  these  schemes.  The  first,  due  to 
Jun  and  Choi  [1993]  uses  simulation  and  variance  reduction  techniques.  The  second 
uses  an  approximation  due  to  Hawkins  [1992c]  which  approximates  the  underlying 
integral  equations  to  find  the  ARL.  Table  4.1  provides  some  typical  in-control  and 
out-of-control  ARL  values.  Details  of  algorithms  and  coding  are  in  the  Appendix. 

The  variance  reduction  scheme  was  used  as  an  independent  check  on  the  au¬ 
thor’s  programming  implementation  of  the  faster,  more  accurate  integral  approxima¬ 
tion. 

Below  we  provide  graphs  of  h  versus  In(ARL)  for  the  CUSUM  chart  in-control 
and  out-of-control  at  the  alternate  value.  Similar  charts  could  be  used  to  select  h. 
Additionally,  we  provide  FORTRAN  routines  in  the  appendix  for  finding  h  for  any 
ARL  for  both  pi  and  A,  given  the  parameters  of  the  process  in-  and  out-of-control. 

It  is  interesting  to  note  from  the  graphs  that  the  rate  of  increase  in  ARL  versus 
h  appears  to  be  approximately  exponential  for  the  in-control  case  and  linear  for  the 


out-of-control  case. 
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Figure  4.3:  A  chart  of  h  vs.  ARL  for  the  CUSUM  for  detecting  shifts  in  A,  with 
/i  =  3,  Ao  =  5,  and  Ai  =  4,  with  samples  of  size  5.  The  lower  curve  is  the  ARL  for  the 
out-of-control  state.  There  is  an  anomaly  for  h  =  26,  from  the  FORTRAN  output, 
causing  the  inappropriate  dip  in  the  curve.  Note  that  this  figure  in  not  in  logarithmic 
scale. 


CHAPTER  4.  CUMULATIVE  SUM  CHARTS  FOR  IG  PROCESSES 


72 


4.4.1  Two  pedagogical  notes 

We  mention  in  passing  that  the  algorithm  for  finding  h  to  meet  a  specified  ARL  is  a 
straightforward  application  of  the  Newton- Raphson  root  finding  algorithm,  with  one 
small  twist:  the  derivatives  are  approximated  numerically,  instead  of  being  computed 
symbolically.  The  routine  is  very  fast,  usually  producing  convergence  in  a  handful  of 
steps. 

We  also  mention  that  the  variance  reduction  scheme  for  the  simulation  method 
is  accessible  to  the  introductory  (calculus-based)  statistics  student  and  provides  a  nice 
motivation  for  discussing  covariance. 

4.4.2  Commeiits  on  AR.L  as  a  measure  of  effectiveness  of  a 
CUSUM  scheme 

There  have  been  objections  in  the  literature  to  using  ARL  as  the  sole  measure  of  the 
effectiveness  of  a  scheme  [Gan,  1992]  [Bissell,  1969]  [  Barnard,  1959].  If  the  distribution 
of  the  run-length  for  the  CUSUM  was  known  for  a  particular  scheme,  it  would  clearly 
be  preferred  for  characterizing  a  CUSUM  scheme.  The  run-length  distribution  is 
known  for  some  simple  distributions,  such  as  the  exponential  [Gan,  1992a].  However, 
in  the  absence  of  knowledge  of  the  run-length  distribution  (such  as  in  the  case  of  the 
inverse  gaussian  distribution),  ARL  remains  the  accepted  method  for  evaluating  the 
performance  of  CUSUM  schemes.  The  practitioner  is  advised  to  keep  in  mind  that  the 
run-length  is  often  skewed,  and  that  the  median  run-length  may  differ  sharply  from 
the  average  run-length.  Deriving  the  exact  run-length  distribution  for  the  inverse 
gaussian  CUSUM  scheme  is  a  topic  for  later  research.  Approximating  the  run-length 

distribution  can  be  done  easily  by  simulation. 

We  recommend  that  the  practitioner  present  simulation  results  indicating  the 
general  shape  of  the  run  length  distribution  for  a  scheme,  as  well  as  the  ARL,  when 
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briefing  a  designed  CUSUM  scheme  to  the  client. 

4.5  Performance  for  small  persistent  shifts 

We  consider  the  performance  of  the  CUSUM  schemes  for  detecting  small,  persistent 
shifts  in  the  parameters.  Moustakides  [1986]  proved  that  the  schemes  are  optimal 
in  the  sense  they  have  the  smallest  expected  number  of  samples  until  a  signal,  when 
shifting  out  of  control  to  the  alternate  value,  of  all  schemes  with  similar  in-control 
false  alarm  rates.  We  construct  tables  to  illustrate  the  response  rates. 

4.5.1  Small  persistent  shifts  in  /z 

In  the  following  tables,  we  present  the  performance  of  the  CUSUM  scheme  in  detecting 
small  persistent  shifts.  We  assume  that  the  process  is  exactly  specified  a  priori. 

We  desire  some  benchmark  to  examine  just  how  optimal  is  the  optimal  proce¬ 
dure.  For  comparison,  we  use  the  CUSUM  scheme  for  the  mean  of  a  normal  distri¬ 
bution,  which  uses  the  arithmetic  mean  between  the  parameter  values  in-control  and 
out-of-control  as  its  reference  value.  Recall  that  our  optimal  CUSUM  for  the  mean  of 
an  Inverse  Gaussian  random  variable  uses  the  harmonic  mean  between  the  in-control 
and  out-of-control  values.  Since  the  harmonic  mean  is  close  to  (but  strictly  less  than) 
the  arithmetic  mean  for  small  shifts,  this  should  provide  a  reasonable  competitor. 
We  will  call  the  use  of  the  arithmetic  mean  for  k  for  the  IG  case  a  “naive”  CUSUM 
scheme,  and  the  use  of  the  optimal  harmonic  mean  the  “optimal”  scheme. 

We  tabulate  the  out-of-control  ARLs  for  both  schemes  for  some  illustrative 
values  of  ju.  A,  and  ARL  in-control.  We  use  the  one-sided  CUSUM  in  these  tables. 
Similar  results  hold  for  the  two-sided  CUSUM. 
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Table  4.2:  Comparing  performance  by  ARL  of  various  control  schemes  to  detect  a 
small  persistent  shift  in  the  mean. 


Shewhart  one-sided  ARLs  were  calculated  as 


ARL  = 


P{X  >  crit\X  ~  IG{na,  A)) 


(4.15) 


by  numeric  integration. 

The  naive  and  optimal  CUSUM  schemes  were  designed  for  their  in-control 
ARLs.  h  was  found  using  the  FORTRAN  routines  in  the  Appendix,  k  depends,  of 


course,  on  the  scheme. 

We  see  from  Xable  4.2  that  for  small  shifts  the  use  of  the  optimal  harmonic 
mean  as  a  reference  value  does  beat  the  benchmark  arithmetic  mean  for  performance, 
but  not  by  much.  This  is  not  surprising,  given  that  Equation  4.8  indicates  that  values 
of  the  harmonic  mean  and  arithmetic  mean  are  very  close  for  small  shifts. 


4.5.2  Small  persistent  shifts  in  A 

We  will  repeat  the  process  of  the  previous  section  to  obtain  a  benchmark  for  the 
CUSUM  for  A.  We  construct  a  naive  alternative  for  k  by  averaging  the  expected 
values  of  a(X)  when  the  process  is  in-control  then  out-of-control.  We  recognize  that 
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Out-ot- control 

ARLi 

A 

Parameters 

A„  ARL  in-control 

Optimal  CUSUM 

“IU“ 

~TL 

- TUD 

7inJ3 

10 

9 

100 

61.08 

10 

11 

1000 

442.12 

10 

9 

1000 

373.09 

5 

5.1 

100 

92.28 

5 

4.9 

100 

90.14 

5 

5.1 

1000 

777.17 

5 

4.9 

1000 

746.67 

100 

101 

100 

96.01 

100 

99 

100 

94.91 

100 

101 

1000 

875.24 

100 

99 

1000 

860.24 

Table  4.3:  Out-of-control  ARLs  for  a  small  persistent  shift  in  A.  Note  that  a  small 
persistent  change  in  A  can  be  very  difficult  to  detect. 

the  expected  value  of  a(X )  in  control  is  1,  and  that  when  A  =  Ao,  the  expected  value  of 
a(X)  is  E(a(X))  =  Averaging  those  two  expected  values  results  in  A:  =  1/2  -h 

These  two  values  for  k  are  very  close.  For  example,  when  A  =  10  and  A^  =  11, 

the  two  values  for  k  differ  by  only  .00144365. 

We  construct  Table  4.3  to  explore  small  persistent  shifts  in  the  scale  parameter, 
A.  We  see  that  it  is  very  difficult  to  detect  a  small  persistent  shift  in  A. 

Examination  of  Table  4.3  shows  that  the  optimal  scheme  has  shorter  out-of¬ 
control  ARLs  than  the  benchmark.  It  also  shows  that  the  optimal  scheme  in  not 
greatly  more  powerful  than  the  benchmark.  This  is  a  result  of  the  underlying  robust¬ 
ness  of  the  CUSUM  to  mis-specification  of  the  out-of-control  state  when  designing 
the  CUSUM  scheme.  One  could  look  at  the  benchmark  value  here  as  the  optimal 
value  to  detect  some  different  shift  in  the  mean,  and  see  that  the  performance  was 
still  fairly  close  to  the  scheme  designed  for  the  actual  shift  in  the  mean. 
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4.6  Conclusions 

In  this  chapter,  we  derived  the  optimal  tests  for  the  CUSUM  of  the  inverse  gaussian 
distribution  for  both  the  mean  and  shape  parameter.  In  the  process  of  doing  this, 
we  developed  an  optimal  test  for  shift  of  scale  for  the  CUSUM  of  a  Xu  random 
variable.  We  found  an  immediate  application  for  that  xf  test  in  our  test  for  the  shape 
parameter.  We  adapted  existing  FORTRAN  codes  to  evaluate  the  ARL  of  these  tests. 
We  checked  those  FORTRAN  codes  against  a  variance  reduction  simulation  scheme. 
We  compared  the  optimal  tests  against  the  usual  heuristic  and  found  that  for  small 
changes  in  the  mean  the  optimal  test  was  not  significantly  better  than  the  usual 
heuristic. 

We  developed  codes  which  allow  the  interested  party  to  design  an  optimal  or 
traditional  CUSUM  scheme,  and  to  find  its  average  run  length. 


Chapter  5 

CUSUM  Embellishments  for  IG 
processes 


In  this  short  chapter,  we  address  two  CUSUM  embellishments  and  how  they  apply 
to  the  work  in  the  previous  chapter. 


5.1  Fast  initial  response  CUSUM 

Lucas  and  Crosier  [1982]  proposed  the  fast  initial  response  (FIR)  CUSUM.  They 
reasoned  that  if  the  CUSUM  was  started  not  at  zero,  but  at  some  value  between 
0  and  the  decision- interval  value  h,  the  CUSUM  would  respond  faster  to  early  out- 
of-control  states.  However,  if  the  process  was  in-control  in  those  early  states,  the 
CUSUM  would  most  likely  return  to  zero,  and  nothing  would  be  lost. 

Lucas  and  Crosier  recommend  a  head  start  value  for  So  =  h/2,  based  on 

numerical  evidence  for  the  normal  scheme. 

The  FORTRAN  programs  discussed  in  the  previous  chapter  can  take  advan¬ 
tage  of  a  feature  of  the  CUSARL  code  of  Hawkins.  That  code  reports  the  FIR  ARL 
for  So  =  h  12  along  with  the  regular  ARL.  It  is,  therefore,  simple  to  adjust  the  pro¬ 
grams  which  find  the  ARL  for  the  A)  to  also  find  the  FIR  ARL.  Generally,  one 

pays  for  increased  speed  of  detection  when  starting  out-of-control  with  a  slightly  lower 
ARL  when  starting  in-control.  To  maintain  the  same  in-control  ARL,  one  adjusts  h 
upward  appropriately,  until  one  obtains  the  desired  in-control  FIR  ARL.  Of  course,  if 
the  process  is  initially  in-control,  but  goes  out-of-control  later,  there  will  be  a  delayed 
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response  compared  with  the  regular  ARL  as  the  CUSUM  moves  toward  the  higher 
value  of  h. 

One  should  use  the  FIR,  then,  if  one  is  concerned  that  the  process  goes  out-of¬ 
control  early.  One  obtains  that  increased  detection  power  at  the  expense  of  decreased 
power  to  detect  later  departures  from  control. 

While  the  ARL  may  not  be  significantly  affected  by  the  use  of  the  FIR  scheme, 
the  underlying  run-length  distribution  is.  One  incurs  more  frequent  short-length  runs, 
accentuating  the  already  skewed  nature  of  the  run-length  distribution.  This  could  be 
of  no  small  annoyance  to  the  process  manager,  who  has  already  had  to  learn  that  the 
mean  run  length  is  greater  than  the  median  run  length. 

One  context  where  such  a  trade-off  is  desirable  is  immediately  after  repairing 
the  process  due  to  an  earlier  signal.  If  one  has  not  correctly  diagnosed  and  repaired 
the  process,  the  process  is  still  out  of  control.  Upon  restarting  the  process,  one  would 
wish  to  know  this  quickly,  to  avoid  continuing  out  of  control. 

The  practice  of  using  FIR  CUSUMs  is  not  universally  accepted,  because  of  the 

above  trade-offs. 

A  topic  for  additional  research  would  be  to  determine  an  algorithm  for  finding 
the  optimal  head  start  value  to  meet  some  criteria.  Since  one  obtains  the  ARLs 
for  many  states  when  one  uses  the  Markov-chain  approximation  for  finding  ARLs 
(discussed  in  the  appendix  when  the  CUSARL  code  is  presented),  one  could  either 
determine  a  better  rule  to  select  a  head  start  or  validate  the  /i/2  heuristic. 

5.1.1  An  example 

Let’s  return  to  Figure  4.2,  which  illustrated  In(ARL)  versus  h  for  an  example.  In 
Figure  5.1,  we  add  the  FIR  scheme  for  the  same  parameter  values.  This  means  we 
start  at  /i+/2  and  Sq  =  h“/2.  We  see  that  the  out-of-control  ARL  is  greatly 
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0  5  10  15 

h.  --  decision  interval 


Figure  5.1:  Comparison  of  FIR  and  regular  ARL  schemes,  with  —  3.0,  ni  —  3.5, 
A  =  5,  and  n  =  5,  where  n  is  the  size  of  the  rational  subgroup.  The  FIR  response 
is  the  lower  of  each  pair  of  curves;  the  upper  pair  the  in-control  ARL  and  the  lower 
pair  the  out-of- control  ARL. 
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Table  5.1:  Table  of  comparable  values  for  regular  and  FIR  CUSUMs,  with  fio  —  3.0, 
/ii  =  3.5,  A  =  5,  and  n  =  1,  where  n  is  the  size  of  the  rational  subgroup. 


Value 

Regular  CUSUM 

FIR  CUSUM 

k 

3.2307 

3.2307 

h 

37.5619 

38.8170 

-50 

0 

19.4085 

ARL  in-control 

1000 

1000 

ARL  out-of-control 

106.89 

75.2727 

reduced,  while  the  in-control  ARL  is  not  affected  nearly  so  much.  However,  if  the 
CUSUM  returns  to  zero  (a  regeneration  point),  the  ARL  then  becomes  much  greater 
that  the  standard  CUSUM  scheme  because  of  the  higher  value  of  h,  the  decision  value. 

As  a  more  focused  example,  we  look  at  the  design  parameters  for  an  ARL  of 
exactly  500  for  this  scheme.  Using  a  routine  available  from  the  author,  we  obtain  the 
information  in  Table  5.1. 

If  the  FIR  CUSUM  returns  to  zero  before  reaching  h,  the  ARL  for  the  scheme 
in  Table  5.1  becomes  1114.7690  in-control.  The  out-of-control  ARL  increases  as  well, 
to  111.3589. 

Accordingly,  we  see  the  fast  response  feature  of  the  FIR  only  applies  if  the 
CUSUM  goes  out  of  control  before  returning  to  zero.  If  the  CUSUM  does  return 
to  zero,  the  scheme  performs  worse  than  the  regular  ARL.  One  should  then  use  the 
FIR  with  caution.  The  FIR  is  still  a  good  choice  for  starting  the  scheme  after  a 
signal  if  there  is  doubt  as  to  whether  or  not  the  condition  causing  the  signal  has  been 


corrected. 
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5.2  Self-starting  CUSUM  for  the  mean 

The  arguments  in  Section  2.4.2  for  self-starting  charts  for  the  Shewhart  Control 
scheme  apply  with  greater  strength  to  the  CUSUM  scheme,  which  was  their  orig¬ 
inal  context.  In  fact,  our  attention  was  first  drawn  to  self-starting  schemes  in  the 
article  by  Hawkins  [1987],  who  specifically  addressed  self-starting  CUSUM  charts  of 
normal  variates. 

We  will  use  the  same  transformations  in  the  self-starting  CUSUM  scheme  that 
we  used  in  the  self-starting  Shewhart  scheme  for  the  mean.  That  is,  we  will  CUSUM 

Z  =  ^-^FriT))  ~  iV(0,l) 

with  $"^(2)  being  the  inverse  CDF  for  the  standard  normal  variate.  Recall 

\Jn\n2{n\  -1-  n2  —  2)  (X  —  Y) 

^Jx  y(niX  -I-  n2Y){Vi  +  V2) 

Our  self-starting  CUSUM  scheme,  in  control,  becomes  a  CUSUM  of  iV(0, 1) 
variates. 

The  sole  diflaculty  becomes  the  computation  of  the  reference  value,  k.  The 
distribution  of  Ti  when  there  is  a  model  departure  is  not  known,  depends  on  the 
length  of  time  the  process  has  been  running  in  control,  and  appears  likely  to  be  too 
complicated  to  be  worth  the  effort  to  find  it. 

We  note  that  Hawkins  [1987]  finessed  the  issue  in  his  paper.  He  expressed  the 
shift  of  the  mean  in  the  original  variable  as  a  multiple  of  standard  deviations,  and 
used  one-half  that  value  as  the  reference  value.  Since  his  intermediate  studentized 
residuals  can  be  thought  of  as  approximating  the  number  of  standard  deviations  away 
from  the  unknown  mean,  this  seems  reasonable.  The  additional  transformation  to  the 
iV(0, 1)  distribution  does  not  radically  affect  this  heuristic.  This  approach  avoided 
the  difficulty  of  finding  the  k  in  the  context  of  the  transformed  variables. 
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Since  we,  by  contrast,  are  using  an  exact  transformation  to  normal  variables, 
we  fall  back  on  standard  procedure.  We  will  set  k  as  being  the  average  between  the 
expected  value  of  Z  in  control  and  the  expected  value  of  Z  when  the  process  is  out  of 
control  at  some  level.  When  the  process  is  in  control,  the  expected  value  of  Z  is  0  and 
the  standard  deviation  of  Z  is  1.  We  appeal  to  the  robustness  of  the  CUSUM  reference 
value  and  select  the  out-of-control  mean  value  to  be  0.1.  This  value  is  obtained  from 
a  simulation  of  various  out-of-control  states  at  various  points,  computing  the  value  of 
Z  for  the  first  observation  out-of-control.  While  it  is  a  very  rough  approximation,  the 
CUSUM  is  known  to  be  robust  to  mis-specifications  of  the  out-of-control  state.  This 
results  in  a  reference  value  of  A:  =  0.05.  When  the  underlying  IG  process  goes  out  of 
control,  both  the  mean  and  the  standard  deviation  of  the  Z  change.  Depending  on 
how  long  the  process  has  been  running,  these  will  change  either  slowly  or  rapidly  back 
to  N{0, 1)  as  the  process  fails  to  signal  over  time.  However,  if  the  process  has  been 
running  an  appreciable  time,  the  drift  back  to  the  original  expected  value  should  be 
relatively  slow. 

If  we  don’t  know  the  distribution  of  the  summand  for  the  CUSUM  when  the 
process  is  out  of  control,  it  is  not  possible  to  analytically  derive  an  optimal  CUSUM 
scheme.  If  we  were  concerned  with  finding  a  better  reference  value  for  a  given  out- 
of-control  state,  we  could  simulate  to  approximate  the  expected  value  of  $“^(Fo(r)), 
and  then  use  the  rule  for  shifts  of  means  of  normal  variates  which  sets 

Mo  +  Ml 
2 

Since  under  a  model  shift,  Fo{T)  is  no  longer  distributed  uniformly  and  it  follows 
that  #~^(Fo(r))  is  not  normal,  this  is  at  best  an  approximation. 

However,  we  are  proposing  this  self-starting  CUSUM  as  a  reasonable,  not 
optimal,  scheme.  We  defer  further  discussion  of  optimality  for  subsequent  work,  and 
proceed  to  examples. 
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As  an  example,  consider  the  following  scenario.  Let  our  true  (but  unknown) 
process  parameters  be  /i  =  42,  A  =  66.  We  will  look  at  samples  of  size  1,  and 
at  observation  50  we  will  change  the  distribution  to  JG(52,  66),  corresponding  to  a 
moderate  shift  in  the  process. 

The  self-starting  CUSUM  is  iUustrated  in  Figure  5.2.  We  run  in-control  for 
50  cases  with  an  JG'(42, 66).  We  will  go  out  of  control  at  observation  51  and  beyond, 
moving  to  an  JG(52,66).  We  set  our  ARL  =  100.  Since  we  have  set  k  =  .05,  we 
have  /i+  =  =  10.2969.  In  this  case,  the  self-starting  CUSUM  signaled  at  the  66th 

observation,  or  16  observations  after  the  process  went  out  of  control. 

For  short  start-ups  before  going  out  of  control,  it  is  not  unusual  for  the  scheme 
to  fail  to  detect  the  change.  The  example  above  can  be  considered  a  short  training 
set,  since  50  points  corresponds  to  only  10  samples  of  size  5. 

Performance  is  better  for  longer  training  sets,  as  indicated  in  Figure  5.3. 

Note  from  Figure  5.3  that,  even  with  the  long  start-up  of  150  observations, 
the  self-starting  scheme  begins  to  adjust  to  being  out-of-control.  This  can  be  seen 
from  the  CUSUM  for  which  starts  to  tail  back  to  the  horizontal  center  line  after 
observation  225  or  so,  despite  remaining  out-of-control. 

For  the  behavior  of  the  out-of-control  CUSUMs,  including  run-length  distri¬ 
bution,  we  must  resort  to  simulation,  since  the  CDF  for  the  distribution  of  the  out- 
of-control  Z  is  not  known. 

We  reiterate  that  the  ARL  will  depend  not  only  on  the  size  of  the  shift,  but  the 
length  of  time  the  process  has  been  running  in-control  prior  to  going  out  of  control. 
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Figure  5.2:  A  combined  self-starting  Shewhart  and  self-starting  CUSUM  chart.  The 
process  ran  in-control  (/G(42,  66))  for  50  samples  of  size  1,  then  went  out  of  control 
to  7 G (52, 66).  The  chart  signals  at  observation  66  for  the  CUSUM,  and  observation 
80  for  the  Shewhart. 
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Figure  5.3:  Self-starting  CUSUM  for  the  mean  of  an  /G  random  variable.  The  process 
is  in-control  at  /G(42, 66)  until  observation  150,  when  it  shifts  to  JG(52, 66).  The 
change  is  signaled  at  observation  193. 
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5.3  Conclusions 

We  have  seen  how  two  standard  embelhshments  can  be  applied  to  the  CUSUM  scheme 
for  processes  well  modeled  by  the  inverse  gaussian  distribution.  The  fast  initial  re¬ 
sponse  allows  improved  sensitivity  to  early  departures  from  control,  at  the  expense  of 
slightly  slower  response  to  later  departures.  The  self-starting  CUSUM  allows  control 
of  the  process  in  the  early  stages,  before  enough  historical  data  has  been  gathered  to 
firmly  establish  the  in-control  parameters  of  the  process. 


Chapter  6 

Bivariate  Shewhart  control  charts 


In  this  chapter,  we  generalize  the  idea  of  using  highest  probability  density  regions 
(HPD  regions)  to  control  two  parameters  of  a  process  simultaneously.  Doing  so  gives 
the  tightest  possible  control  limits.  This  can  be  useful  in  many  contexts.  We  also 
address  a  vexing  diagnosis  problem  found  with  traditional  rectangular  charting  meth¬ 
ods. 


6.1  Current  practices 

The  traditional  control  chart  scheme  maintains  separate  charts  for  each  process  pa¬ 
rameter.  For  example,  the  charts  for  normally  distributed  processes  include  one  for 
location  and  one  for  scale.  If  either  chart  signals,  the  process  is  deemed  out  of  control. 
If  the  scale  chart  signals,  the  process  is  deemed  to  have  undergone  a  scale  shift.  If 
the  location  chart  shifts,  however,  it  is  not  immediately  clear  whether  that  is  due  to 
a  location  shift  or  a  scale  shift.  In  that  case,  the  accepted  practice  is  to  first  examine 
the  scale  chart  to  see  if  there  is  any  indication  of  a  scale  shift.  In  its  absence,  only 
then  does  one  assume  that  there  has  been  a  location  shift. 

The  operation  of  two  charts  as  above  is  similar  to  running  one  bivariate  chart 
with  rectangular  limits.  See  Figure  6.1.  If  a  point  plots  inside  the  rectangular  limits, 
then  the  process  is  assumed  to  be  in  control.  If  the  process  plots  in  region  I  (signaling 
a  scale  shift  )  the  process  is  assumed  out-of-control  for  scale.  The  process  is  only 
considered  out  of  control  for  location  if  a  sample  is  plotted  in  region  II.  See,  for 
example,  Montgomery  [1991],  who  says,  “Never  attempt  to  interpret  the  X  chart 
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Figure  6.1:  A  bivariate  Shewhart  contro 
out-of-control  regions  are  labeled  I  and 
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1  chart  with  rectangular  limits.  Two  possible 
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when  the  R  chart  indicates  an  out-of-control  condition” .  (An  R  chart  is  another  type 
of  control  chart  for  scale,  not  discussed  here.) 


6.2  Improved  diagnosis  of  an  out-of-control  signal 


This  algorithm  can  be  improved  by  dividing  the  out-of-control  region  into  three  areas, 
instead  of  two.  We  will  classify  out-of-control  points  as  either  a  location  shift,  a  scale 


shift,  or  both. 

Our  approach  will  be  based  on  likelihood  ratios.  Our  first  step  is  to  hypothesize 
that,  given  an  out  of  control  signal,  both  the  location  and  scale  have  shifted  to  their 
(now)  most  likely  values,  the  MLEs.  We  then  construct  regions  indicating  a  scale 
shift  or  a  location  shift  only,  based  on  a  likelihood  ratio  with  the  parameters  being 
either  both  of  the  MLEs  (signaling  both  parameters  have  shifted)  or  just  one  of  the 
MLEs  (corresponding  to  only  one  parameter  shifting) .  We  examine 


^location 


^  scale 


f{x,  s'^\n  =  x,  =  q-q) 
f{x,  s^\(M  =  x,a^  =  s^) 
f{x,  s^l/x  =  /X0,(7^  =  s^) 
f{x,s‘^\lJ,  =  x,a^  =  s^) 


(6.1) 

(6.2) 


If  either  A  is  greater  than  some  critical  value,  we  reject  the  hypothesis  that  both 
parameters  have  shifted  in  favor  of  the  hypothesis  that  only  one  has  shifted.  This 
means  that  the  likelihood  of  a  shift  of  just  one  parameter  is  not  much  smaller  than 
the  likelihood  of  both  parameters  shifting. 

For  the  normal  case,  we  use  charts  for  the  mean  and  sample  variance.  Then 


— 21n  Asca/e  =  — nln(s^)  +  2nln((Jo)  +  (n  —  l)(s^  —  Oq)Ioq  (6-3) 


which  does  not  depend  on  x.  This  implies  that  if  we  set  -21n  AscoZe  <  c,  we  obtain 
boundaries  parallel  to  the  x  axis. 
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On  the  other  hand, 

-2  In  Aiocatian  =  ^  --  (6-4) 

which  implies  that  accepting  the  hypothesis  of  a  mean  shift  only  depends  on  both  x 
and  and  we  obtain  an  acceptance  region  bounded  by  a  quadratic  curve. 

The  distribution  of  —2  In  Aiocatian  is  either  central  or  non-central  de¬ 

pending  on  whether  the  mean  has  shifted  or  not.  The  distribution  of  —2\nAscaie  is 
not  a  standard  one,  and  also  depends  on  the  correct  specification  of  Appealing 
to  asymptotic  theory  for  the  distribution  of  —  21nAscaZe  ~  Xi  [Bickel  and  Doksum, 
1977], we  obtain  working  critical  values. 

For  the  test  of  location  shift  against  both  location  and  scale  shift,  we  use  the 
appropriate  critical  value  from  the  F  distribution. 

Note  this  .05  significance  level  is  not  for  the  test  of  whether  the  process  has 
gone  out-of-control;  rather,  it  is  for  the  discriminating  between  the  hypotheses  that 
both  parameters  have  shifted  or  only  one  has  shifted  given  that  we  already  have  a 
signal  that  the  process  is  out-of-control.  Loss  considerations  could  motivate  us  to 
select  other  critical  values,  if  there  were  higher  costs  associated  with  mis-diagnosis  of 
one  state. 

For  example,  using  a  .05  significance  level,  /io  =  0,  cr  =  1,  and  n  =  5,  we  obtain 
a  xl  05  critical  value  of  9.4877.  Setting  Equation  6.2  equal  to  the  critical  value,  we 
obtain  two  roots:  =  .07132  and  =  5.5037.  Setting  Equation  6.1  equal  to  ferity 

we  obtain  a  parabola: 

Further  exploration  of  the  exact  critical  values  is  deferred  for  future  work.  We 
note,  however,  that  from  the  preceding  paragraphs  the  shape  of  the  regions  is  known. 
This,  coupled  with  approximation  theory  and  simulation,  allows  for  the  selection  of 
reasonable  approximations  to  the  exact  boundaries. 

We  use  these  regions  to  determine  our  diagnostics.  Say  our  rectangular  region 
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has  been  constructed  to  have  an  a  significance  level.  Then,  given  we  have  an  out-of- 
control  signal,  if  e  (.07132,5.5037),  we  have  a  mean  shift  only.  If  >  .6486x^, 
we  have  a  scale  shift  only.  Otherwise,  we  proceed  under  the  conclusion  that  we  have 
both  a  scale  and  a  location  shift.  Figure  6.2  illustrates  the  regions. 

The  diagnostic  boundaries  are  derived  without  regard  to  the  control  bound¬ 
aries.  Accordingly,  it  is  possible  to  have  an  out-of-control  signal  which  does  not  fall 
in  the  diagnostic  regions.  We  shall  see  an  example  of  this  later.  Accordingly,  there  is 

a  fourth  diagnostic  state:  indeterminate. 

In  summary,  we  are  able  to  classify  points  out-of-control  as  arising  from  a 
mean  shift  only,  a  scale  shift  only,  both  location  and  scale,  or  unknown.  Here  we 
have  used  “location”  and  “scale”  to  represent  the  parameters  controlled.  The  same 
process  applies  to  the  control  of  other  parameters,  such  as  shape. 


6.3  HPD  bivariate  control  regions 

Rectangular  hmits  are  known  not  to  give  the  smallest  possible  region  for  a  given 
significance  level.  The  HPD  region  gives  that  smallest  possible  region.  However,  the 
HPD  region  is  not  rectangular,  and  is  therefore  not  the  intuitive  first  choice  when 
dealing  with  a  pair  of  statistics  for  location  and  scale,  especially  when  the  statistics 

are  independent  when  the  process  is  in  control. 

We  propose  the  following  scheme.  First,  determine  the  HPD  in-control  or 
acceptance  region  for  the  joint  density  of  the  sampling  distributions.  This  region  will 
have  the  form 

Rk  = 

where  fic(x,  y)  is  the  in-control  density,  k  is  found  numerically  or  by  simulation  for  a 
given  significance  level. 


CHAPTER  6.  BIVARIATE  SHEWHART  CONTROL  CHARTS 


92 


Figure  6.2:  Diagnostic  regions  for  bivariate  Shewhart  chart.  Given  a  signal  out-of¬ 
control,  we  classify  the  signal  as  either  a  location  shift,  scale  shift,  or  both,  depending 
where  the  signal  is  located.  These  boundaries  are  for  a  iV(0, 1)  in  control.  The 
rectangular  control  region  is  omitted. 
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Figure  6.3:  Spin  plot  of  X,  V,  and  /(X,  F)  for  1000  samples  of  size  five  from  an 
IG{3,5)  distribution. 

We  propose  this  scheme  for  any  bivariate  distribution,  but  we  will  demonstrate 
it  with  an  example  using  the  inverse  gaussian  sampling  distribution. 

By  illustration,  we  offer  Figures  6.3,  6.4,  and  6.5.  Figure  6.3  is  a  three- 
dimensional  spin  plot  of  X,  V  and  f{X,V)  for  1000  samples  of  size  5  from  an  /G(3,  5) 
distribution.  Figure  6.5  is  a  plot  of  the  solution  to  /(X,  V)  —  k,  with  k  selected  so 
that  P(/(X,  V)<k)  =  .01. 

Second,  plot  the  bivariate  observations  as  they  occur,  signaling  if  the  bivariate 
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Figure  6.4:  HPD  out-of-control  region  by  simulation.  Spin  plot  of  X,V,  and  /(X,  V) 
for  1000  samples  of  size  five  from  an  IG{S,  5)  distribution,  censored  to  show  the  100 
points  with  the  smallest  values  of  f{X ,V),  rotated  for  effect. 
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Figure  6.5:  Graph  of  f{X,  V)  =  k,  where  P{f{X,  V)  >  k)  =  .01,  for  samples  of 
size  five  from  an  /G(3,  5)  distribution.  The  exterior  of  the  curve  is  the  HPD  rejection 
region.  Compare  with  Figure  6.4. 
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observation  is  outside  the  HPD  acceptance  region.  Interpretation  of  the  signal  will  be 
discussed  below.  Third,  to  reduce  chart  clutter,  use  a  weighting  scheme  that  reduces 
the  intensity  of  the  plotted  points  on  the  (electronic)  chart  until  points  older  than 
a  given  number  of  observations  disappear.  Make  the  most  recent  observation  green, 
make  any  out  of  control  observations  red;  make  all  others  black. 

Calculation  of  the  bivariate  HPD  region  requires  solution  of  an  integral  equa¬ 


tion:  find  k  such  that 


/  f{x,y)dxdy 
J  Rh 


for  p  =  1  -  Oi.  A  solution  to  Equation  6.6  is  found  either  by  a  numerical  search 
method,  based  on  Newton-Raphson,  or  by  a  routine  to  simulate  the  distribution  of 
y(X,  y)  and  determine  the  appropriate  quantile.  This  approach  is  similar  to  the  one 
used  earlier  to  find  the  HPD  region  for  the  predictive  schemes. 

Calculation  of  the  in-control  ARL  is  not  necessary:  one  simply  inverts  the 


significance  level. 

Calculation  on  the  out-of-control  ARL  is  easily  (if  slowly)  accomplished  using 
a  numerical  integration  routine.  Let  fic{x,  y)  be  the  joint  sampling  density,  as  before, 
when  the  process  is  in  control,  and  foc{x,  y)  the  joint  density  when  the  process  is  out 
of  control.  Let  Ik,fiAx,y)ix,  y)  be  the  indicator  function  which  is  1  if  fic{x,  y)  <  k.  One 
finds: 

OO  OO 

I  —  I3  =  P(signal|  out-of  control)  =  j  j  Ik,fic{x,y){x^^y)foc{^>y)  dx  dy  (6.7) 

—OO  — OO 

One  can  find  the  ARL  for  any  out-of- control  state  using  Equation  6.7,  since 
the  out-of-control  ARL  will  be 

This  approach  for  declaring  out-of-control  regions  is  applicable  to  any  distri¬ 
bution.  We  shall  provide  examples  using  two  distributions:  the  normal  distribution 
and  the  inverse  gaussian  distribution. 
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6.3.1  An  example 

Here  is  an  example.  Let  X  N{fx,(T^).  We  take  samples  of  size,  say,  5.  Normal 
practice  would  chart  X  and  S^.  X  ~  N(/Li,  y)  ^  ~  sample 

statistics  are  independent.  Therefore,  the  joint  density  in-control  of  (X ,  S  )  is  given 
by  the  product  of  their  marginal  densities.  Now,  in-control,  let  fj,  =  0  and  a  =  1. 
Then  the  joint  density  is: 


fic{x>  y)  = 


2-v/l0j/exp(— 2j/  —  2.5a;^) 


(6.8) 


with  X  =  x  and  y  =  s"^. 

We  set  the  ARL  at  100,  giving  a  =  .01.  Using  an  Xlisp-Stat  routine  available 
from  the  author,  we  determine  k  by  simulation,  obtaining  k  =  0.004845. 

Now  let  the  out-of-control  distribution  be  given  by  A  ~  N(l,  1).  The  out-of¬ 
control  ARL  is  given  by  Equation  6.7,  which  simplifies  to  ARL  =  3.98. 

Figure  6.6  illustrates  the  situation.  Level  curves  for  the  in-control  density  are 
plotted,  along  with  one  out-of-control  level  curve. 


6.4  Implementation  of  basic  scheme 

There  are  three  computational  issues  when  implementing  this  scheme. 

First,  finding  the  appropriate  value  of  k  requires  solving  a  two  dimensional 
integral  equation  involving  indicator  functions.  This  can  only  be  done  numerically, 
requires  iteration,  and  is  verj”^  slow,  especially  when  using  double  precision  arithmetic. 
We  used  simulation  techniques  for  a  fast  approximation. 

Second,  plotting  the  control  HPD  fimits  requires  an  ability  to  plot  implicitly 
defined  functions  of  the  form  f{x,y)  =  k,  which  is  not  supported  by  all  graphical 
packages,  particularly  XLISP-STAT. 
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Center  Delete  Help  Move  Options  Plot  Quit  Range  Scale  Transfer 


IlindoiiT  aXes  Zoon 

Enter  option  X 

Cross  jc:2. 25  y:0  Scale  x:0.5  y:l  Derive  2D~p lot 

Figure  6.6!  Level  curves  for  the  bivarie^te  Shewhe-rt  chart  normal  example.  The  in- 
control  curves  are  from  a  iV(0, 1)  distribution.  The  single  out-of- control  curve  is  from 
a  Ar(l,2^)  distribution. 
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Last,  using  the  technique  of  varying  colors  and  point  intensity  requires  some 
fairly  advanced  programming  skills. 

We  have  a  patchwork  approach  to  implementation.  First,  we  can  find  k  using 
a  FORTRAN  program  (again  available  from  the  author)  that  runs  very  slowly,  or  by 
simulation.  Second,  once  we  have  k,  we  plot  the  control  limits  using  Derive,  which 
easily  plots  the  implicit  functions.  Third,  we  do  the  actual  plotting  of  points  in 
XLISP-STAT  which,  allows  the  coloring  of  points  and  varying  of  intensity.  We  were 
unable  to  find  and  unwilling  to  develop  a  routine  to  do  implicit  plots  in  XLISP-STAT. 

With  additional  programming  effort,  the  three  tasks  could  be  combined  in  one 
software  application,  but  the  patchwork  is  sufficient  for  demonstration  purposes. 


6.5  Comparison  with  traditional  charts 

We  will  compare  the  performance  of  the  bivariate  chart  proposed  here  with  that  of 
the  traditional  Shewhart  chart. 

Both  charts  have  the  same  in-control  behavior,  since  both  are  constructed  for 
the  same  significance  level.  When  the  k  level  for  the  HPD  is  found  by  simulation,  the 
in-control  ARL  may  not  be  exact.  The  ARL  for  the  estimated  k  should  be  checked 
to  assure  ourselves  that  there  is  not  a  significant  departure  from  the  design  ARL. 

We  compare  charts  by  their  out  of  control  ARL  for  various  combinations  of 
out-of-control  parameters. 

6.5.1  Normal  case 

For  our  normal  case,  we  will  use  X  ~  A(0, 1)  as  our  in-control  distribution.  We  take 
samples  of  size  5.  We  design  for  an  ARL  of  100. 

Consider  first  our  bivariate  Shewhart  method.  Then  our  joint  density  was 
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given  in  Equation  6.8  for  the  case  n  =  5,  ^  —  0,  and  a  —  1.  We  earlier  found 
k  =  .004845. 

For  our  various  cases,  we  then  compute  (numerically  or  by  simulation)  our 
out-of-control  ARL,  by  solving  the  integral  equation  in  Equation  6.7. 

We  compare  this  with  the  standard  normal  case.  For  each  of  the  two  charts, 
we  choose  the  significance  level  o-i  =  1  —  so  the  rectangular  region  has  the 

same  significance  as  our  bivariate  chart. 

Our  control  hmits  for  X  are  given  by  /x  ±  Za-itO'fVn,  which  in  this  example 

reduces  to 

-2.8062/\/5  <  X  <  2.8062/\/5 


Our  control  limits  for  5^  are  given  by 

yn  —  1  n  —  1  J 

which  in  this  example  reduces  to 

.145053/4  <  <  16.4183/4 

Our  out-of-control  ARL  for  the  standard  charts  in  this  example  is  found  as 
follows; 

=  P  ((-1.645/V^<  X  <  1.645/\/5)  IX  ~  iV(/x,cr))  (6.9) 

//. 14503  o  16-4183\_  X 

PS2  =  P  ( ( — ^  <  — I — )  1^  ~  j 

ARL  =:  - - - -  (6-11) 

1  -  Px  X  P52 

We  tabulate  some  of  these  values  for  various  out-of-control  cases  in  Table  6.1. 
We  obtained  the  value  of  k  for  the  bivariate  case  by  simulation,  which  explains  why 
the  ARL  is  not  exactly  100  for  the  in-control  case. 
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a 

ARL-bivariate 

ARL-standard 

IT 

- mTZO 

lUU.UO 

1 

1 

3.98 

3.47 

2 

1 

1.07 

1.05 

.5 

1 

21.97 

19.79 

0 

1.5 

4.22 

5.75 

0 

.5 

8887. 

28.76 

1 

1.5 

2.00 

2.32 

1 

.5 

11.9914 

6.3622 

Table  6.1:  Comparison  of  ARLs  for  bivariate  Shewhart  and  a  pair  of  standard  charts. 
ARLs  are  given  in  number  of  samples  of  size  5.  The  in-control  distribution  is  Ar(0, 1). 

We  see  from  the  Table  6.1  that  we  have  improved  power  for  increases  in  cr  , 
reduced  power  for  decreases  in  cr^,  and  comparable  power  for  detection  of  shifts  in 
location.  We  also  see  that  we  have  improvement  in  our  detection  of  simultaneous 
shifts  in  fi  and  increases  in  a. 

As  an  example  of  the  charting,  we  take  samples  of  size  5,  with  the  process 
in-control  as  A  iV(0, 1).  It  shifts  at  observation  10  to  Xo  A  (1,1).  lA/e  show 
the  charts  for  observations  9  onward  in  Figures  6.7  through  6.13.  For  this  black  and 
white  printer,  the  points  in  control  are  marked  by  “"’’s  of  various  sizes,  with  larger 
points  being  more  recent.  Also,  due  to  diflSculty  implementing  the  plotting  implicit 
functions,  the  control  limit  curve  is  not  shown.  The  points  out  of  control  are  marked 
by  red  “o”s.  The  most  recent  point  is  given  by  if  in-control.  To  avoid  clutter, 
only  the  9  most  recent  in-control  points  are  displayed.  Out-of-control  points  remain 
visible  indefinitely. 

When  the  process  goes  out  of  control,  we  are  able  to  diagnose  the  out-of-control 
point  using  the  rules  portrayed  in  Figure  6.2  to  see  that  we  most  likely  have  a  mean 
shift  only. 
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Figure  6.8:  Bivariate  Control  chart  for  Normal  Example,  up  to  observation  10.  Notice 
only  the  most  recent  9  points  plot. 
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Figure  6.9:  Bivariate  Control  chart  for  Normal  Example,  up  to  observation  11.  The 
process  is  now  out  of  control  at  N(l,  1).  The  chart  has  not  yet  signaled. 
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Figure  6.10:  Bivariate  Control  chart  for  Nornaal  Example,  up  to  observation  12.  The 
process  is  now  out  of  control  at  N (1, 1).  The  chart  has  not  yet  signaled. 
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Figure  6.11:  Bivariate  Control  chart  for  Normal  Example,  up  to  observation  13.  The 
process  is  now  out  of  control  at  N  (1, 1).  The  chart  has  not  yet  signaled. 
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Figure  6.12:  Bivariate  Control  chart  for  Normal  Example,  up  to  observation  14.  The 
process  is  now  out  of  control  at  N  (1, 1).  The  chart  has  not  yet  signaled. 
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Figure  6.13:  Bivariate  Control  chart  for  Example,  up  to  observation  15.  The  process 
is  out  of  control,  and  has  now  signaled  out-of-control.  Applying  the  rules  developed 
in  Figure  6.2,  we  diagnose  a  mean  shift  only. 
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6.5.2  IG  case 

We  now  turn  to  the  IG  case,  and  develop  bivariate  Shewhart  charts  based  on  the 
HPD  for  this  distribution. 

The  joint  density  for  X,  V  is  given  by 


f{x,v)  = 


-X  ^/^exp 


We  will  define  our  control  region  by  using  the  highest  probability  density.  We  define 
Rk  =  {(x,  v)\f{x,  v)  >  k}  and  set  k  so  that  ARL  =  p^(xv)^R^^  desired  ARL. 

As  with  the  normal  case,  this  requires  finding  a  solution  to  Equation  6.6.  This  is  no 
less  difficult  than  in  the  normal  case.  We  approximate  k  by  simulation. 

We  use  a  similar  diagnostic  scheme,  based  on  likelihood.  Using  —2  In  A,  we 
obtain  diagnostic  regions  for  V  and  X  when  the  process  signals  out  of  control. 

If  21n(n  -  1)”/^  -  nln(Au)  +  Au  -  n  +  1  <  c,  we  declare  the  process  out  of 
control  for  a  scale  shift.  This  equation  will  have  two  roots,  so  we  obtain  a  region  of 
the  form  U  <  ci  or  F  >  C2. 

-we  declare  the  process  out  of  control  for  a  mean  shift  only. 
This  equation  will  be  a  quadratic,  and  is  similar  in  form  to  the  equation  we  obtained 
for  the  normal  diagnostic  curve. 

As  before,  if  we  don’t  signal  an  exclusive  shift,  we  assume  that  both  parameters 
have  shifted. 


6.5.3  An  example 

We  assume  that  we  have  an  IG{S,  5)  process  in-control.  We  draw  samples  of  size  5. 
We  desire  an  ARL  of  100.  By  simulation,  we  determine  that  k  =  3,0  10“^.  Our 
HPD  and  diagnostic  lines  are  plotted  in  Figure  6.14.  We  run  the  bivariate  chart  for 
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Figure  6.14:  Bivariate  HPD  regions  for  an  IG{3,  5)  with  samples  of  size  5.  The  in¬ 
control  region  is  shaded.  Out  of  control  areas  are  labeled  with  their  diagnosis.  Note 
the  scales. 
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Figure  6.15:  Long  run  Bivariate  HPD  chart  for  IG{3,  5)  with  1000  observation.  Only 
the  last  9  in-control  points  are  plotted.  There  are  10  outliers,  marked  with  diamonds. 

1000  observations  in  control,  plotting  as  before  the  last  9  observations  plus  all  out-of¬ 
control  observations.  We  observe  10  points  out-of-control,  illustrated  in  Figure  6.15. 
This  is  exact  agreement  between  observed  and  predicted  number  of  out-of-control 
observations. 

We  shift  the  process  to  an  IG(Q,A)  and  continue  to  take  samples  of  size  5.  In 
our  1000  points,  we  observe  299  observations  signaling  out-of-control.  The  chart  for 
this  run  is  shown  in  Figure  6.16. 
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Figure  6.16:  Out-of-control  bivariate  HPD  chart,  with  1000  observations. 
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6.6  Interpretation  of  a  signal 

When  an  observation  is  out  of  control,  we  can  automate  the  diagnosis  by  including  the 
classification  algorithm  in  the  computing  code.  When  the  process  signals,  the  point  is 
plotted  in  an  out-of- control  region,  which  visually  indicates  to  the  operator  an  initial 
diagnosis.  We  can  also  print  the  out-of-control  values,  and  report  the  hkelihood  ratios 
for  a  mean  shift  only  against  a  shift  in  both  mean  and  scale,  and  a  scale  shift  only 
against  a  shift  in  both. 

With  any  diagnostics  based  on  likelihood,  it  is  possible  for  an  observation  out 
of  control  for  one  reason  to  plot  in  a  different  region.  Operators  should  be  especially 
alert  to  this  possibility  when  the  out-of-control  observation  is  close  to  a  boundary. 

6.7  Conclusions 

We  have  proposed  and  developed  two  ideas  in  this  chapter.  First,  we  have  advocated 
using  bivariate  control  charts  to  control  processes  to  allow  better  diagnosis  of  out- 
of-control  signals.  We  applied  this  method  to  both  the  normal  and  inverse  gaussian 
distribution.  Second,  we  have  examined  using  the  HPD  region  as  the  control  region 
for  these  bivariate  charts. 

In  many  contexts,  one  follows  the  same  procedures  to  react  to  an  out-of¬ 
control  process  regardless  of  its  suspected  cause.  In  other  contexts,  one  proceeds  quite 
differently  based  on  the  initial  diagnosis.  In  those  contexts,  the  improved  diagnostic 
tools  of  this  chapter  save  time  and  money  by  directing  the  corrective  actions  first  to 
the  most  likely  cause. 


Chapter  7 

Application  to  combat  models 


In  the  next  two  chapters,  we  apply  the  tools  we  have  developed  to  problems  of  interest. 

In  this  chapter,  we  examine  control  of  software  revisions.  In  the  next  chapter,  we  look 
at  an  automobile  assembly  hne. 

7.1  Background 

There  are  two  major  approaches  to  modeling  combat.  The  first,  originated  by  Fred¬ 
erick  Lanchester  [1956]  at  the  turn  of  the  century,  represents  combat  by  differential 
equations.  The  second,  currently  popular,  involves  high  detail  computer  simulations. 
Each  suffers  from  weaknesses.  In  this  section,  we  propose  a  hybrid  model  based  on 
the  inverse  gaussian  distribution,  which  captures  some  of  the  advantages  of  both. 

Given  this  hybrid  model,  we  can  monitor  simulated  or  actual  operations  to 
detect  model  changes,  using  the  tools  developed  in  the  preceding  chapters. 

Many  authors  have  attempted  to  model  combat.  The  first  and  arguably  most 
influential  was  Frederick  Lanchester.  He  proposed  simple  differential  equation  models 
for  attrition,  where  the  rate  of  change  of  the  force  level  of  one  side  was  a  function 
of  the  friendly  force  level  and  the  opposing  force  level.  The  form  of  the  function 
depended  on  the  type  of  combat.  The  solutions  to  these  differential  equations  have 
been  used  extensively  in  military  modeling  [Taylor,  1981]  [Taylor,  1983]. 

These  Lanchester  Equations  are  recognized  to  have  several  shortcomings  [Hughes, 
1964]  [Dupuy,  1987]  [Ventisel,  1964].  They  are  deterministic,  simplistic,  and  do  not 


114 


CHAPTER  7.  COMBAT  MODELS 


115 


fit  the  historical  record  well.  Still,  they  are  widely  used  because  they  are  easily  un¬ 
derstood  and  may  give  insights  despite  their  weaknesses.  The  Corps  level  model 
Vector-in-Commander  used  by  the  Army  is  a  deterministic  model  based  on  Lanch- 
ester  Equations. 

A  second  approach  has  arisen  with  the  advent  of  powerful  computers.  This 
is  the  high  resolution  simulation.  In  these  models,  every  actor  on  the  battlefield  is 
modeled.  Then  a  stochastic  simulation  is  run,  and  the  results  reported.  For  example, 
one  soldier  may  be  set  in  motion  towards  an  object.  If  he  makes  contact  with  an 
enemy  soldier,  there  is  a  conflict  resolution  according  to  some  stochastic  algorithm. 
And  the  game  proceeds  according  to  the  results.  A  large  number  of  actors  and  a  large 
number  of  conflicts  produce  a  very  large  space  of  possible  outcomes  of  the  simulation. 

Some  of  these  simulations  are  interactive,  with  humans  making  decisions  at 
appropriate  points.  Others  are  not;  they  run  as  programmed  until  some  stopping 
criterion  is  met. 

Unfortunately,  there  are  serious  questions  about  the  utility  of  these  large  scale 
simulations.  [Dupuy,  1987] 

First,  they  are  only  as  good  as  the  underlying  algorithms,  and  in  many  cases 
the  algorithms  rely  on  either  Lanchester  Equations  (which  are  known  to  be  flawed) 
or  on  Monte-Carlo  models  with  the  parameters  estimated  on  an  ad  hoc  basis.  These 
assumptions  are  usually  invisible  to  the  user  of  the  simulation,  and  tend  to  become 
obscured  and  lost  even  to  those  who  are  responsible  to  maintain  and  improve  the 
simulations.  This  is  not  due  to  a  lack  of  diligence  or  professionalism,  but  is  a  result 
of  the  sheer  size  of  the  programs,  the  turn-over  of  personnel,  and  the  fundamental 
lack  of  good  algorithms. 

Second,  these  models  also  do  not  have  good  records  replicating  the  historical 

record. 

A  third  approach  to  modeling  combat  has  been  taken  by  those  who  have 
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attempted  to  construct  statistical  models  based  on  the  historical  record.  Robert  L. 
Helmbold  offers  a  survey  of  these  results  [Helmbold,  1990],  Despite  the  best  efforts  of 
many  talented  people,  these  statistical  models  have  failed  to  explain  the  variation  in 
the  overall  historical  record  well,  with  the  best  regression  models  enjoying  «  .3. 

These  issues  are  not  just  of  interest  to  the  military  academic.  Procurement, 
doctrine,  and  force  structure  decisions  are  being  made  on  the  basis  of  the  results  of 
military  models.  This  is  a  trend  that  will  continue  and  accelerate,  as  the  cost  of 
conducting  analysis  based  on  physical  models  is  prohibitive.  Sound  stewardship  of 
national  resources  as  well  as  prudence  in  the  conduct  of  the  national  defense  make  it 
imperative  that  the  models  used  be  correct  as  possible. 

7.1.1  Underlying  hypothesis  of  Brownian  motion  for  combat 
models 

Combat  operations  are  inherently  stochastic.  This  nature  argues  in  favor  of  deeper 
models  than  the  Lanchester  differential  equations,  which  at  best  can  be  considered 
models  of  the  expected  results  of  combat  and  completely  fail  to  capture  the  distribu¬ 
tion  of  results  that  may  occur. 

It  is  possible  to  cast  these  equations  as  stochastic  differential  equations  (SDEs), 
and  attempt  to  solve  them.  Hov/ever,  complicated  situations  with  many  mixtures  of 
players  argue  that  such  stochastic  differential  equations  would  be  very  complicated 
to  solve.  Additionally,  the  coefficients  of  such  SDEs  would  need  to  be  fit  or  estimated 
from  either  historical  data  or  some  other  modeling  effort. 

In  the  remainder  of  this  chapter,  we  make  a  large  assumption.  Without  at¬ 
tempting  to  explicitly  define  the  SDEs,  their  coefficients,  or  their  number  and  type, 
we  will  assume  that  their  solution  is  well  modeled  by  Brownian  motion  with  drift. 

This  is  not  such  a  huge  assumption  as  it  might  appear.  First,  on  its  face  it 
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appears  a  reasonable  model  of  the  physical  nature  of  combat  operations.  Many  indi¬ 
vidual  actions  occur  as  a  force  moves  toward  an  objective,  some  favorable,  and  others 
unfavorable.  The  sum  of  these  actions  constitutes  the  net  effort  of  the  larger  unit. 
It  is  not  unreasonable  to  model  this  ebb  and  flow  of  battle  as  Brownian  motion  with 
drift.  This  seems  especially  unobjectionable  when  the  characteristic  we  are  modeling 
is  actual  movement  of  a  unit.  Attrition  models  require  a  stronger  assumption,  since 
attrition  tends  to  accelerate  as  one  side  gains  ascendancy. 

Second,  this  type  of  model  captures  the  benefits  of  the  simplicity  of  the  differ¬ 
ential  equation  approach  while  retaining  the  distributional  nature  of  the  simulation 
approach.  It  also  is  much  less  computationally  intensive  than  the  simulation  effort. 

Third,  it  appears  partially  supported  by  the  historical  record.  Helmbold  [1990] 
looked  at  advance  rates  for  634  battles.  He  concluded: 

The  upshot  of  our  analysis  is  the  advance  rates  are  not  normally  dis¬ 
tributed.  Their  distribution  is  highly  skewed,  and  much  more  closely  fit 
by  lognormal  distributions  than  by  any  of  the  others  tried  (normal,  expo¬ 
nential,  Weibull,  and  gamma.) [Helmbold,  1990] 

Helmbold  did  not  try  the  inverse  gaussian  distribution.  We  know  that  data  well  fit 
by  the  log-normal  distribution  is  usually  also  well  fit  by  the  inverse  gaussian. 

We  examine  the  log-normal  probability  plot  in  Figure  3-1  of  Helmbold.  This  is 
a  plot  of  the  advance  rates  observed  in  57  battles  in  the  Italian  Theater  during  W .  W .11 
between  the  fall  of  1942  and  spring  of  1944.  We  notice  that  the  data  does  appear  to 
be  well  fit  by  the  log-normal  except  at  the  tails,  where  there  is  a  slight  S  shape.  The 
exact  same  tail  behavior  is  demonstrated  by  the  graph  of  1000  ln(/G(3,  5))  variates 
when  similarly  plotted  in  Figure  7.1.  In  other  words,  it  is  very  plausible  that  the  data 
from  Helmbold  is  better  fit  by  the  inverse  gaussian  model  than  by  the  log-normal 
model.  Without  access  to  the  original  data,  this  is  as  compelling  an  argument  as  can 
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Figure  7.1:  Example  of  the  tail  behavior  when  an  IG  random  variable  is  plotted  on 
a  log-normal  probability  plot. 
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be  made  from  the  graphical  evidence  available.  We  are  encouraged  to  continue  our 
efforts  to  model  advance  rates  with  the  inverse  gaussian  distribution. 

Of  course,  what  we  would  expect  from  the  physical  process  is  that  the  time  to 
accomplish  a  mission  was  distributed  as  an  inverse  gaussian  random  variable.  This 
too  seems  plausible.  If  the  advance  rate  is  well  modeled  as  inverse  gaussian,  then  the 
time  to  move  a  fixed  distance  is  distributed  as  the  reciprocal  of  an  inverse  gaussian. 
Since  the  distribution  of  the  reciprocal  of  an  inverse  gaussian  random  variable  is 
known  and  similar  in  form  to  an  inverse  gaussian,  we  proceed  to  model  the  movement 
itself  as  having  an  inverse  gaussian  distribution.  We  note  that  if  this  was  true,  then 
the  log-normal  probabihty  plot  of  the  advance  rate  would  appear  as  in  Figure  7.2, 
which  again  matches  the  figure  in  Helmbold  and  is  similar  to  Figure  7.1. 

A  separate  issue  raises  itself,  and  we  defer  it  for  future  study.  If  the  move¬ 
ment  of  forces  appears  to  follow  an  inverse  gaussian  distribution,  perhaps  regression 
methods  of  explaining  combat  based  on  the  inverse  gaussian  distribution,  instead  of 
the  normal  distribution,  may  be  successful  for  statistical  modeling  of  combat. 

7.1.2  Program  maintenance  and  unintended  effects 

The  computer  simulations  used  to  model  combat  are  very  large  programs.  The  pro¬ 
grams  are  constantly  being  maintained,  as  new  algorithms  are  added,  new  weapons 
systems  are  modeled,  and  new  scenarios  imagined.  Historical  data  also  causes  model 
changes  and  updates,  as  when  the  experience  of  the  Gulf  War  did  not  match  the 
predictions  of  the  simulations  run  during  the  planning  phases. 

The  major  Army  proponent  for  these  models  is  located  at  the  Training  and 
Doctrine  Command  Analysis  Center  (TRAC)  activity  at  the  White  Sands  Missile 
Range  (WSMR).  There,  analysts,  modelers,  and.  programmers  continually  change  the 
program  code.  To  test  for  unintended,  programming  effects,  TRAC-WSMR  has  a 
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Figure  7.2:  Example  of  the  tail  behavior  when  the  reciprocal  of  an  /G  random  variable 
is  plotted  on  a  log-normal  probability  plot. 
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suite  of  scenarios  it  runs  after  updating  the  programs  to  check  for  consistency  with 
earlier  versions  of  the  software. 

There  are  many  measures  associated  with  each  model  run.  For  example,  one 
can  measure  friendly  losses,  enemy  losses,  the  ratio  of  friendly  to  enemy  losses,  and 
so  on.  If  we  represent  the  number  of  friendly  forces  by  X  and  the  number  of  enemy 
forces  by  Y,  then  ^  is  the  Loss  Exchange  Ratio.  This  measure  is  widely  used,  and 
the  one  we  examine  in  the  remainder  of  this  chapter. 

We  defer  examination  of  other  measures  for  later  work  for  two  reasons.  First, 
additional  post-processing  of  simulation  runs  is  expensive,  and  we  are  already  in¬ 
debted  to  the  Rand  Corporation  and  to  TRAC- White  Sands  Missile  Range  for  the 
extensive  work  they  did  to  produce  data  on  the  loss  exchange  ratios.  Second,  the 
loss  exchange  ratio  is  a  widely  used  measure  for  modelers,  and  we  shall  see  below 
that  it  appears  to  be  well-modeled  by  the  inverse  gaussian  distribution.  This  lends 
credibility  and  practical  significance  to  the  work,  which  would  not  be  obtained  by 
working  with  a  less  familiar  measure. 

We  have  proposed  to  TRAC-WSMR  and  to  the  Army  Research  Laboratory 
that  the  time  to  accomplish  the  mission  and  the  time  to  make  decisions  be  examined 
to  determine  if  the  inverse  gaussian  distribution  is  an  appropriate  model.  The  Army 
Research  Laboratory  has  agreed  to  fund  the  author’s  investigation  of  those  topics. 
That  research  will  be  conducted  in  the  summer  of  1996  in  conjunction  with  a  large 
command  and  control  exercise  at  Fort  Leavenworth,  Kansas.  We  feel  that  those 
measures  have  a  better  theoretical  basis  for  being  modeled  as  inverse  gaussian  variates. 

7.2  Goodness  of  fit 

We  have  as  our  initial  data  80  replications  of  a  simulation  conducted  at  Fort  Hood, 
Texas,  in  December  1993.  These  trials  were  used  to  measure  the  effectiveness  of  the 
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Figure  7.3:  A  histogram  of  force  exchange  ratios  from  the  Blue  Defense  vignette  for 
the  M1A2  lOTE.  The  fitted  IG  distribution  has  been  superimposed. 

M1A2  main  battle  tank.  The  trials  consisted  of  a  Blue  armor  task  force  defending 
against  a  Red  force.  We  were  provided  the  friendly  losses  (“Blue  losses”),  the  enemy 
losses  (“Red  losses”),  and  the  resulting  loss  exchange  ratio. 

This  model  represents  damage  to  each  vehicle  as  a  vector,  representing  of 
damage  to  different  sub-systems.  The  model  changes  the  allowed  behavior  of  each 
vehicle,  based  on  its  damage  vector.  Different  ways  of  representing  this  allowed 
behavior  have  been  coded  into  the  model,  8.nd  simulated.  These  different  approaches 
constitute  our  possible  out-of-control  observations. 
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The  MLEs  for  the  parameters  for  an  inverse  gaussian  model  of  the  base  data 
are  ft  =  1.697875  and  A  =  11. 2150895.  A  histogram  of  the  data  with  the  fitted  density 
is  provided  at  Figure  7.3  and  indicates  a  good  fit.  We  pursue  goodness  of  fit  testing 
next. 

7.2.1  Goodness  of  fit  of  IG  model 

Following  Edgeman,  Scott  and  Pavur  [1988],  we  perform  a  modified  Kolmogorov- 
Smirnov  test  for  the  goodness  of  fit  of  the  inverse  gaussian  distribution.  The  Kolmogorov- 
Smirnov  statistic  =  0.0982.,  We  further  adjust  this  value  using  the  regression 
equation  obtaining  =  1.2753. 

The  critical  value  for  rejecting  the  hypothesis  that  the  data  is  well  fit  by  an 
inverse  gaussian  distribution  at  the  .20  significance  level  is  greater  than  1.994,  using 
Table  I  in  Edgeman,  Scott  and  Pavur.  The  critical  value  at  the  .10  significance  level 
is  at  least  2.356.  Accordingly,  we  find  no  evidence  that  the  data  is  not  well  fit  by  an 
inverse  gaussian  distribution. 

7.3  Results 

We  will  now  use  the  procedures  in  this  thesis  to  detect  changes  to  the  underlying 
model.  We  have  three  possible  out-of-control  scenarios. 

The  first  one  represents  a  large  model  change.  We  will  attempt  to  detect  these 
changes  with  our  self-starting  and  predictive  Shewhart  charts. 

The  second  two  scenarios  correspond  to  small  persistent  model  changes.  These 
are  the  ones  least  hkely  to  be  noticed  after  program  maintenance.  For  these  we  will 
use  self-starting  CD SUM  charts. 

We  could,  of  course,  apply  all  methods  to  all  cases,  but  we  have  chosen  to 
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limit  ourselves. 

We  assume  the  base  case  corresponds  to  the  process  in-control.  Since  we  have 
less  than  100  in-control  individual  observations,  we  will  use  self-starting  and  predictive 
methods. 

The  first  case  we  examine  is  a  previous  version  of  this  computer  combat  model. 
The  old  methodology  assigns  a  one-dimensional  utility  number  to  a  combat  vehicle. 
This  number  represents  the  fraction  of  capability  left  in  the  vehicle.  It  does  not 
distinguish  between  loss  of  capability  due  to  loss  of  mobility  or  that  due  to  loss  of  a 
weapon  system.  We  expect  this  model  to  perform  differently  from  the  base  case.  We 
have  20  observations  for  this  case. 

The  second  two  cases  are  modifications  to  the  base  case.  The  second  case 
allows  only  two  states  for  each  subsystem:  operable  or  inoperable.  The  third  case 
allows  the  utility  for  each  subsystem  to  be  any  value  in  the  interval  [0, 1].  This 
contrasts  with  the  base  case,  which  has  a  finite  number  or  degraded  utility  states. 
We  have  80  observations  of  each  of  these  cases. 

Figure  7.4  shows  a  boxplot  of  the  four  data  sets,  with  the  base  case  at  the  left, 
then  the  old  program,  and  the  two  further  modifications. 

Each  of  the  four  models  tracks  damage  to  the  vehicles  differently,  and  feeds 
that  information  into  the  main  battle  simulation.  We  note  from  Figure  7.4  that  the 
old  model  is  obviously  different  from  the  other  three,  but  that  the  two  modifications 
do  not  look  very  different  from  the  base  case. 

Normally,  we  would  not  use  control  charts  in  this  scenario,  but  rather  tests  of 
equality  of  parameters.  This  is  especially  true  because  we  know  precisely  when  the 
possible  model  changes  occur.  We  use  these  four  cases  to  illustrate  a  more  sophisti¬ 
cated  process.  Over  the  life  of  this  code,  there  are  literally  hundreds  of  changes  in 
the  details  of  the  implementation.  The  four  described  above  are  not  very  significant 
programming  changes,  and  refer  to  one  very  small  piece  of  the  code.  How  does  one 
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Figure  7.4:  Boxplot  of  the  four  data  sets  used  in  this  chapter.  Prom  left  to  right, 
they  are  the  base  case  (assumed  in-control),  the  old  program,  and  two  modifications 
to  the  base  case.  Source:  White  Sands  Missile  Range,  1996. 
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monitor  the  overall  performance  of  the  code  when  there  are  continual  changes,  most 
of  which  are  not  supposed  to  produce  any  change  in  the  output  characteristics?  In 
this  setting,  control  charts  are  useful  to  track  the  model  over  time,  when  it  is  unknown 
when  a  substantial  change  to  the  model  may  occur. 

There  is  no  history  of  statistical  testing  of  equivalency  of  models  after  pro¬ 
gramming  changes  at  WSMR,  according  to  the  sources  who  provided  this  data.  We 
hope  that  this  methodology  will  be  adopted  by  WSMR,  and  we  have  reason  to  believe 
that  it  will  be. 

The  concept  of  charting  computer  program  performance  against  a  set  of  bench¬ 
marks  during  program  modifications  is  useful  in  more  contexts  than  just  this  one 
military  application.  It  allows  a  holistic  view  of  software  maintenance  over  time,  in 
any  context. 

7.3.1  Self  starting  Shewhart  Charts 

Here  we  compare  the  base  case  with  the  old  model,  using  a  self-starting  Shewhart 
scheme  for  the  mean.  The  chart  is  at  Figure  7.5.  We  see  that  even  with  this  relatively 
large  model  change,  we  do  not  get  a  signal  in  20  observations,  although  there  is  an 
obvious  downward  trend. 

If  we  break  the  data  into  samples  of  size  five,  there  is  also  no  indication  of 
a  shape  change  on  the  self-starting  Shewhart  chart  for  shape,  found  in  Figure  7.3.1. 
Perhaps  the  lack  of  evidence  is  due  to  the  process  not  running  long  enough.  To  check 
this,  we  sample  from  the  in-control  data  200  times  to  produce  a  new  “base  case”. 
We  then  check  the  self-starting  charts  for  this  data,  again  with  just  our  original  20 
out-of-control  points. 

We  obtain  an  immediate  signal  on  our  self-starting  Shewhart  chart  for  with 
this  strategy,  as  seen  in  Figure  7.3.1.  We  obtain  no  signal  in  our  self-starting  chart 
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Figure  7.5:  A  self-starting  Shewhart  chart  for  the  mean  of  the  base  case  and  historical 
data. 
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Figure  7.6:  Self-starting  Shewhart  chart  for  A.  Data  is  treated  as  samples  of  size  5. 
No  evidence  of  a  shape  change  is  found. 
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for  A. 

We  see  no  conclusive  evidence  in  our  original  self-starting  Shewhart  charts  of 
a  model  shift  in  either  fj,  or  A.  We  do  see  evidence  of  a  shift  in  the  mean  in  our 
“bootstrap”  self-starting  Shewhart  chart  for  the  mean. 

7.3.2  Predictive  Shewhart  Charts 

We  construct  the  predictive  chart  for  the  base  case  followed  by  the  large  shift  in  Fig¬ 
ure  7.8.  Just  as  the  self-starting  chart,  the  predictive  chart  does  not  signal,  although 
it  does  indicate  a  downward  trend  in  the  data. 

We  increase  the  size  of  the  training  set  by  bootstrapping  the  base  case  to  200 
points,  and  construct  at  Figure  7.9  the  predictive  chart  for  the  data.  We  see  that, 
unlike  the  self-starting  chart,  the  predictive  chart  does  not  signal  immediately.  There 
are  several  interesting  features  to  this  chart.  First,  there  are  7  out-of-control  points 
in  the  first  200  observations.  With  an  ARL  of  100,  we  expect  two  out-of-control 
points.  This  indicates  that  the  process  may  not  be  in  control.  Since  these  points  are 
normalized  from  the  predictive  scheme,  that  indicates  that  the  process  may  not  be 
modeled  well  by  the  inverse-gaussian.  This  is  interesting,  and  argues  for  more  precise 
distributional  testing  of  the  White  Sands  data. 

7.4  Self-starting  CUSUM  charts  for  the  mean 

We  apply  the  self-starting  CUSUM  schemes  to  the  comparison  of  the  base  case  with 
all  three  model  changes.  The  charts  are  displayed  in  Figures  7.10,  7.11,  and  7.12. 

We  see  that  the  self-starting  charts  detect  the  mean  shift  quickly  for  the  shift 
from  the  base  case  to  the  historical  case,  and  relatively  quickly  for  the  shift  from  the 
base  case  to  the  other  two  degraded  states  models. 
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Figure  7.7:  Self-starting  Shewhart  chart  for  n.  Training  set  has  been  redefined  to 
200  observations  by  sampling  with  replacement  from  the  original  80  base  case  points. 
The  chart  signals  immediately  when  the  data  from  the  historical  model  is  charted. 
Note  that  the  points  are  labeled  from  0  upward,  and  the  point  plotted  as  observation 
zero  is  observation  3  compared  with  observations  1  and  2  from  the  data  stream.  The 
signal  at  point  200  corresponds  to  the  3rd  point  from  the  historical  data. 
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Figure  7.8:  Predictive  chart  for  the  base  case  followed  by  the  historical  data.  While 
the  chart  does  not  signal,  evidence  indicates  a  downward  trend  after  observation  80. 
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Figure  7.9:  A  predictive  Shewhart  chart  with  bootstrapping  to  increase  the  training 
set  to  200  points  from  80.  Note  the  first  two  observations  are  not  plotted  (since  we 
need  at  least  two  points  to  predict  the  next  observation)  and  points  that  are  plotted 
are  numbered  with  the  first  point  being  0,  as  is  the  Xlisp-Stat  convention.  The  bold 
points  at  the  right  of  the  chart  are  the  observations  from  the  historical  model,  different 
from  the  base  case.  There  are  several  interesting  aspects  to  this  chart,  including  the 
unexpectedly  high  number  of  out-of-control  points.  Those  out-of-control  points  are 
also  highlighted. 
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Figure  7.10:  Self-starting  CUSUM  of  base  case  and  historical  case.  ARL  =  250.  Note 
the  chart  signals  at  observation  84,  four  observations  after  going  out  of  control. 
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Figure  7.11:  Self-starting  CUSUM  of  base  case  and  first  model  change.  ARL  =  250. 
Note  the  chart  signals  at  observation  114. 
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Figure  7.12:  Self-starting  CUSUM  of  base  ca^e  and  second  model  change.  ARL  =  250. 
Note  this  chart  also  signals  at  observation  114. 
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7.5  Conclusions 

We  have  seen  that  the  methods  of  this  thesis  detect  changes  in  the  output  of  combat 
simulation  models.  This  is  significant  for  three  reasons. 

First,  we  have  modeled  the  output  of  these  combat  models  as  inverse-gaussian 
variates.  This  is  a  novel  modeling  strategy,  and  offers  promise  for  fresh  insight  into 
the  behavior  of  these  combat  simulations.  It  allows  a  new  approach  to  comparing  the 
output  of  different  versions  of  the  computer  code  based  on  statistical  methodology,  as 
the  sampling  theory  of  inverse  gaussian  variates  is  well  known.  In  particular,  it  offers 
a  way  to  determine,  before  simulation  begins,  an  appropriate  number  of  simulation 
runs,  based  on  power  considerations.  This  would  be  a  vast  improvement  over  the 
current  approach  of  running  30  simulations  due  to  an  appeal  to  the  central  limit 
theorem. 

Second,  we  have  applied  control  chart  methodology  to  this  new  modehng  strat¬ 
egy.  This  holds  promise  for  quafity  control  of  large  computer  programs  of  any  sort 
which  are  subject  to  continual  revision. 

Last,  we  have  shown  how  the  tools  of  this  thesis  can  be  used  in  a  practical 
setting.  Our  control  chart  methods,  based  on  the  inverse  gaussian  distribution,  apply 
in  this  context  where  other  methods  based  on  other  distributions  would  have  been 
inappropriate. 


Chapter  8 

Application  to  automobile 
manufacturing:  General  Motors 


This  chapter  examines  another  application  of  the  inverse  gaussian  distribution  as 
a  model  for  task  completion  time.  The  example  comes  from  an  article  in  Applied 
Statistics,  by  A.  F.  Desmond  and  G.  R.  Chapman  [1993],  We  again  demonstrate  how 
the  tools  developed  in  this  thesis  can  be  useful  to  both  the  industrial  statistician  and 
quality  controller. 


8.1  Background 

In  their  paper,  Desmond  and  Chapman  examined  the  time  to  complete  a  task  by  crews 
of  workers  at  the  General  Motors  plan  in  Oshawa,  Ontario.  The  crews  performed 
repetitive  tasks  on  an  assembly  hne.  The  facility  in  which  they  worked  electronically 
monitored  “all  aspects  of  the  operation  of  the  plant.”  [Desmond  and  Chapman,  1993]. 
Desmond  and  Chapman  looked  at  three  processes,  of  which  we  consider  one.  The 
thrust  of  Desmond  and  Chapman’s  paper  was  modehng  these  task  completion  times 
with  mixtures  of  inverse  gaussian  distributions.  The  third  process  “exhibits  no  mixing 
whatsoever.”  This  is  the  process  we  will  study  in  this  chapter. 

We  examine  the  station  where  the  parts  required  to  install  a  radio  were  assem¬ 
bled.  Desmond  and  Chapman  [1993]  describe  this  “radio  kitting  station”  as  the  place 
where  “  a  single  worker  performs  the  simple  task  of  reading  a  docket  indicating  what 
parts  are  required,  taking  the  parts  from  the  appropriate  bins,  and  placing  them  into 
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a  tray.” 

For  reasons  very  close  to  the  ones  addressed  in  our  introductory  chapter, 
Desmond  and  Chapman  chose  to  model  the  task  completion  time  using  inverse  gaus- 
sian  random  variables,  instead  of  using  Weibull,  gamma,  lognormal  or  other  skewed 
non-negative  distributions. 

Those  authors  also  caveat  their  data  with  the  warning  that  the  reported  times 
were  subject  to  variation  from  actual  time,  due  to  operator  discretion  in  the  opera¬ 
tion  of  the  mechanism  which  signaled  task  completion,  and  other  sources.  General 
Motors  pre-processed  the  data  prior  to  releasing  it  to  Desmond  and  Chapman.  The 
completion  times  were  sorted;  accordingly,  no  time-series  analysis  is  directly  applica¬ 
ble.  Additionally,  Desmond  and  Chapman  screened  for  what  they  considered  outliers 
(they  termed  them  “bogus  readings”)  and  removed  them. 

What  was  reported  in  the  article  were  the  MLEs  from  1955  observations  of  a 
process  claimed  to  be  well  modeled  by  an  inverse  gaussian  distribution.  The  authors 
offered  to  make  the  data  itself  available  for  examination,  but  had  not  forwarded  it 
as  of  the  date  this  was  written.  Accordingly,  we  accept  their  claim,  and  reserve  for 
future  work  emy  additional  goodness  of  fit  testing. 

The  task  completion  times  are  modeled  as  inverse  gaussian  random  variables, 
with  fi  -  42.6257  and  A  =  66.282.  We  note  the  large  reported  confidence  interval  for 
these  estimates  {(i  G  (41.0678,44.1501)).  The  units  of  time  were  not  specified  in  the 
paper,  but  we  assume  (from  the  context)  that  they  are  in  seconds.  Figure  8.1  shows 
the  density  of  this  distribution.  We  note  the  very  heavy  tail  to  the  right.  The  median 
for  this  model  is  32.5051,  and  the  mode  is  18.1071. 

We  note  that  conventional  quality  control  charts  based  on  the  normal  dis¬ 
tribution  are  inappropriate  here.  Because  of  the  heavy  tail  to  the  right,  charts  for 
centrality  will  never  behave  as  expected,  signaling  too  frequently.  This  situation  ar¬ 
gues  for  inverse  gaussian  charts,  and  illustrates  that  they  are  not  of  only  theoretical 
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interest. 

What  features  are  of  interest  to  those  monitoring  this  process?  As  this  is  a 
station  in  an  assembly  line,  steady  flow  is  desirable.  Increases  in  the  mean  processing 
time  or  a  decrease  in  A,  or  both,  will  decrease  the  service  rate  at  this  station  and 
can  cause  slowdown  in  the  overall  assembly  process.  On  the  other  hand,  decreases 
in  and  increases  in  A  allow  management  to  identify  circumstances  which  reflect 
improved  service  rates  and  decreased  variation  in  the  process. 

We  noted  in  earlier  chapters  that  the  HPD  test  was  more  effective  at  detecting 
increases  in  /i  than  the  (corrected)  UMPU  test  of  Edgeman.  Here  is  an  instance  where 
that  appears  to  be  advantageous. 

8.2  Results 

In  the  last  chapter,  we  used  the  self-starting  and  predictive  schemes  because  we  had 
time-series  data,  and  startup  conditions.  We  did  not  use  those  means  that  assumed  we 
had  accurate  process  knowledge,  namely  the  corrected  Edgeman  Shewhart  scheme, 
the  HPD  schemes,  and  the  standard  CUSUM.  For  the  General  Motors  example  of 
this  chapter,  in  contrast,  we  do  not  have  time  series  data  available.  However,  we 
have  extensive  process  history  (1955  observations)  from  which  to  estimate  our  MLEs. 
Accordingly,  we  will  not  use  the  self-starting  or  predictive  schemes  in  this  context. 
Rather,  we  will  use  the  corrected  Edgeman  scheme,  the  standard  CUSUM,  and  the 
HPD  charts. 

Between  this  chapter  and  the  previous  chapter,  we  will  have  illustrated  each 
of  the  tools  developed  in  this  thesis. 

We  set  the  in-control  ARL  equal  to  100  for  each  of  the  calculations  that  follow. 
We  will  use  samples  of  size  1,  unless  otherwise  indicated,  to  facilitate  comparison 
between  the  CUSUM  and  Shewhart  schemes. 
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Table  8.1:  Table  of  ARLs  for  corrected  Edgeman  Shewhart  control  scheme  for  samples 
of  size  1  from  an  /G(^,  66.282)  distribution. 


ARLs  out  of  control 

& 

ARL 

28.8038 

25 

46.8314 

30 

66.1206 

35 

84.6923 

40 

97.8671 

42.6257 

100 

45 

98.2058 

50 

84.8686 

60 

50.5315 

70 

30.4222 

80 

20.4493 

8.2.1  Shewhart  Chart  for  the  mean 

We  find,  using  Equations  2.8  and  2.9,  that  our  control  limits  for  process  central¬ 
ity  are  given  by  6.98446  <  X  <  260.141.  In  control,  that  gives  an  ARL  of  100. 
Straightforward  integration  gives  the  results  in  Table  8.1. 

We  see  that  even  for  large  shifts  in  /i,  this  scheme  is  relatively  slow  to  detect 
the  process  changes. 

8.2.2  Shewhart  chart  for  A 

We  mentioned  in  Chapter  2  that  the  Shewhart  chart  for  changes  in  A  behaved  unex¬ 
pectedly.  We  will  develop  that  assertion  now. 

For  this  example,  we  consider  samples  of  size  5.  We  also  consider  A  known 
from  historical  data,  and  equal  to  66.282.  We  know  that  XV  ~  x%  and  we  construct 
our  control  limits  as  Equations  2.13  and  2.14  direct. 

We  find  LCL  =  .206987  and  UCL  =  14.8602.  Those  limits  give  an  ARL  in 
control  of  100,  and  out-of-control  ARLs  as  presented  in  Table  8.2. 

We  note  that  the  scheme  is  not  effective  for  detecting  upward  shifts  in  A. 
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Table  8.2;  Table  of  ARLs  for  the  corrected  Edgeman  Shewhart  scheme  for  A  for 
an  in-control  distribution  of  /G(42, 6257, 66.282).  The  scheme  is  reasonably  effective 
detecting  decreases  in  A,  but  is  not  effective  detecting  upward  shifts. 


ARLs  out  of  control 

A 

ARL 

10 

1.44617 

20 

2.89911 

30 

6.57186 

40 

15.6796 

50 

36.7708 

60 

74.6842 

66.282 

100 

70 

110.963 

80 

118.352 

90 

105.769 

100 

89.6385 

110 

75.6325 

120 

64.4051 

130 

55.4939 

140 

48.3559 

150 

42.5583 

This  results  from  both  the  skewness  of  the  distribution  of  the  random  variable 
and  the  nature  of  the  test.  An  upward  shift  in  A  corresponds  to  a  scale  change 
in  the  distribution  of  V,  resulting  in  compression  of  the  distribution  of  V  towards 
zero.  The  skewness  of  the  distribution  means  that  the  such  compression  adds  more 
probability  on  the  right  hand  side  of  the  control  chart,  following  such  a  shift,  than 
is  lost  on  the  left  hand  side.  Consider  a  shift  from  A  =  66.282  to  A^  =  80,  an 
increase  by  a  factor  of  about  1.2.  With  Ao  =  80,  P{\V  <  .206987)  =  .0071840,  while 
P{XV  >  14.8602)  =  .0012730.  In  other  words,  the  probability  of  being  signaled  as 
out-of-control  has  dropped  to  0.00845170  from  0.01000,  or  about  16%.  This  results 
in  the  increase  of  the  ARL  to  118.352.  Figure  8.2  illustrates.  Since  this  chart  is 
essentially  the  same  chart  for  a  scale  change  in  a  normal  distribution,  which  is  given 
by  the  limits 

v2  n-2  v2  0-2 

hCl,  — 


n  —  1 


n  —  1 


=  UCL 
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Figure  8.2:  This  graph  shows  the  pdf  of  the  sampling  distribution  of  V  when  in 
control  (the  lower  curve,  A  =  66.282)  and  when  A  has  increased  to  80.  Note  that  the 
increase  in  A  has  resulted  in  a  compression  of  the  density  towards  the  origin. 
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it  appears  that  using  charts  would  suffer  from  the  same  defect. 

While  this  scheme  does  not  detect  increases  in  A  well,  in  this  context  that 
is  not  necessarily  a  fatal  defect.  Recall  that  the  variance  of  an  IG  random  variable 
is  inversely  proportional  to  A.  Increases  in  A  result  in  decreased  process  variability, 
which  is  desirable.  It  is  less  critical  to  detect  improvements  quickly.  On  the  other 
hand,  the  scheme  does  a  reasonably  good  job  of  detecting  shifts  in  A  which  result 
in  increased  variability,  and  it  is  increased  variability  which  will  wreak  havoc  with  a 
queuing  system.  If  the  scheme  has  to  be  weak  in  one  direction,  it  at  least  is  weak  in 
the  least  dangerous  direction. 

8.2.3  HPD  chart  for  the  mean 

Calculating  the  HPD  limits  for  samples  of  size  one,  we  obtain  LCL  =  3.05051 
and  UCL  =  171.599.  We  compare  these  limits  with  the  corrected  Edgeman  lim¬ 
its  (6.98446,  260.141)  and  see  by  inspection  that  the  HPD  test  will  be  quicker  to 
detect  upward  shifts  in  the  mean,  and  slower  to  detect  downward  shifts  in  the  mean. 
Again,  in  this  context,  that  is  desirable  behavior.  We  illustrate  with  Table  8.3.  This 
example,  when  compared  with  the  earlier  corrected  Edgeman  example,  highlights  that 
the  HPD  test  is  not  the  uniformly  most  powerful  test.  However,  since  we  are  more 
interested  in  detecting  increases  in  the  mean  than  in  detecting  decreases,  that  feature 
is  not  unattractive.  Yet  compared  to  a  one-sided  test,  it  still  can  detect  downward 
shifts  in  the  mean,  albeit  more  slowly. 

8.2.4  HPD  chart  for  A 

We  find  that  the  HPD  limits  for  V  are  given  by  LCL  =  .017469  and  UCL  =  13.2854. 
Since  these  limits  are  to  the  left  of  the  Edgeman  limits,  we  expect  that  we  will  again 
see  the  HPD  scheme  to  be  more  sensitive  to  one  side  than  the  corrected  Edgeman 
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Table  8.3:  Table  of  ARLs  for  HPD  scheme  for  the  mean  of  an  /G(/x,  66.282)  distribu¬ 
tion  with  samples  of  size  1.  We  see  that  the  test  detects  upward  shifts  much  better 
than  it  detects  downward  shifts. 


AKLs  out  of  control 

TO 

1U64:88~ 

20 

14490 

30 

2115 

40 

152.205 

42.6257 

100 

50 

42.2147 

60 

20.4234 

70 

12.9217 

80 

9.46838 

90 

7.5729 

100 

6.40606 

110 

5.62765 

120 

5.07682 

charts.  We  present  Table  8.4  to  illustrate. 

8.2.5  CUSUM  chart  for  the  mean 

We  now  turn  our  attention  to  CUSUM  charts  for  the  mean.  We  construct  a  table  of 
ARLs  for  various  shifts  in  the  mean,  and  present  it  in  Table  8.5.  The  ARLs  presented 
are  for  a  shift  to  the  indicated  out-of-control  state,  with  the  CUSUM  parameters  se¬ 
lected  for  maximum  power  to  detect  changes  to  that  state.  As  expected,  the  CUSUM 
scheme  detects  these  changes  much  more  quickly  than  the  Shewhart  scheme. 

8.2.6  CUSUM  chart  for  A 

We  now  turn  our  attention  to  CUSUM  charts  for  A.  We  construct  a  table  of  ARLs 
for  various  shifts  in  A,  and  present  it  in  Table  8.6.  The  ARLs  presented  are  for  a 
shift  to  the  indicated  out-of-control  state,  with  the  CUSUM  parameters  selected  for 
maximum  power  to  detect  changes  to  that  state.  Compare  Table  8.6  to  Table  8.2, 
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Table  8.4:  Table  of  ARLs  for  HPD  scheme  for  A  of  an  7G(44.6257,  A)  distribution 
with  samples  of  size  1.  We  see  that  the  test  detects  upward  shifts  much  better  than 
it  detects  downward  shifts. 


ARLs  out  of  control 

A 

ID 

1.35924 

20 

2.46350 

30 

5.02293 

40 

10.9235 

50 

24.7391 

60 

57.581 

66.282 

100.0 

70 

136.451 

80 

325.887 

90 

769.017 

100 

1716.42 

110 

3308.50 

120 

4965 

130 

5724 

140 

5591 

150 

5115 

Table  8.5:  Table  of  ARLs  for  CUSUM  for  the  mean,  GM  example.  We  use  an  in¬ 
control  ARL  of  100  for  /i  =  42.6257  and  A  =  66.282. 


ARLs  out  of  control 

AJIL 

20 

b:03 

25 

11.32 

30 

18.83 

35 

33.16 

40 

65.23 

42.6257 

100 

45 

64.42 

50 

34.28 

60 

16.54 

70 

10.86 

80 

8.23 
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Table  8.6:  Table  of  ARLs  for  CUSUM  for  A,  GM  example.  We  use  an  in-control  ARL 
of  100  for  ji  =  42.6257  and  A  =  66.282.  Note  that  this  is  a  CUSUM  of  individual 
observations. 


ARLs  out  of  control 

A 

ARL 

lU 

STOS 

20 

5.79 

30 

10.27 

40 

18.05 

50 

32.52 

60 

62.59 

66.282 

100 

70 

80.82 

80 

52.60 

90 

38.92 

100 

31.06 

110 

26.01 

120 

22.51 

130 

19.95 

140 

18.06 

150 

16.50 

the  table  for  the  corrected  Edgeman  Shewhart  charts  for  A.  The  latter  table  was  for 
samples  of  size  five,  and  it  performed  poorly.  By  contrast,  the  CUSUM  of  individual 
observations  does  not  display  increased  ARLs  for  out-of-control  states  with  increased 
A. 


8.3  Conclusions 

We  have  presented  performance  data  for  our  various  tools,  applied  to  the  General 
Motors  case.  We  have  seen  that  the  CUSUM  schemes  for  /i  and  A  perform  much 
better  than  the  HPD  and  Edgeman  schemes.  We  have  also  seen  that  the  HPD  scheme 
performs  better  than  the  Edgeman  scheme  for  detecting  process  changes  of  interest 
to  the  process  manager,  namely  increases  in  fj,  and  decreases  in  A. 

Finally,  by  presenting  an  assembly  line  application  in  the  automobile  manu¬ 
facturing  industry,  we  hope  to  have  again  shown  that  the  methods  of  this  thesis  are 


CHAPTER  8.  GENERAL  MOTORS  148 

of  practical  as  well  as  theoretical  significance. 


Chapter  9 
Conclusions 


This  short  chapter  summarizes  the  work  of  this  thesis,  and  outlines  further  research 
areas. 


9.1  Summary  and  significance 

The  title  of  this  work  is  “Topics  in  Statistical  process  control” .  In  this  thesis,  we  have 
explored  several  topics  which  found  conunon  application  in  the  setting  of  Chapter  7 
and  Chapter  8. 

We  first  reviewed  the  existing  literature  on  the  statistical  process  control  of 
inverse  gaussian  variates,  due  to  Edgeman.  We  extended  and  modified  his  work  with 
Shewhart  control  schemes. 

We  introduced  self-starting  and  predictive  control  charts,  explored  their  prop¬ 
erties,  and  gave  examples  of  their  use.  As  part  of  that  work,  we  derived  the  predictive 
distribution  for  the  next  m  observations  of  an  inverse  gaussian  variate,  given  the  first 
n  observations. 

We  introduced  HPD  control  regions  for  skewed  distributions,  and  explored 
their  advantages  and  disadvantages. 

We  explored  bivariate  control  charts,  and  gave  new  diagnostic  rules  for  classi¬ 
fying  out  of  control  observations.  We  extended  this  idea  to  bivariate  control  charts 
based  on  HPD  regions. 

We  developed  optimal  CUSUM  schemes  for  the  parameters  of  an  inverse  gaus¬ 
sian  distribution.  As  a  consequence,  we  obtained  optimal  CUSUM  schemes  for  the 
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scale  parameter  of  a  Gamma  variate.  We  developed  numerical  routines  to  find  the 
parameters  of  these  CUSUM  schemes  based  on  both  variance  reduction  simulation 
and  numerical  approximations  to  integral  equations,  using  work  of  Hawkins  for  the 
latter. 

We  found  a  new  application  for  these  tools,  control  of  software  modeling  mili¬ 
tary  combat.  We  modeled  the  output  of  those  simulations  as  inverse  gaussian  variates, 
and  applied  our  new  tools  to  this  data.  We  extended  this  work  to  the  idea  of  control¬ 
ling  complex  software  subject  to  constant  revision. 

Finally,  we  returned  to  an  industrial  setting  for  our  final  application,  and 
showed  that  even  the  much  examined  automotive  assembly  line  could  benefit  from 
the  application  of  the  tools  of  this  thesis. 

In  all  of  the  above  work,  we  wrote  software  to  implement  the  algorithms  we 
developed,  and  used  that  software  in  our  examples. 

9,2  Future  work 

There  are  several  exciting  areas  for  future  research  opened  by  this  thesis. 

First,  what  is  the  distribution  of  the  run  length  of  a  CUSUM  scheme  which 
goes  out  of  control  when  the  S~^  and  S~  are  not  zero?  Since  most  departures  from 
control  occur  when  these  values  are  not  zero,  this  has  practical  significance  as  well 
as  theoretical  interest.  Solution  allows  for  generalizing  the  idea  of  the  Fast  Initial 
Response  CUSUM. 

Second,  how  can  the  diagnostic  tools  for  a  bivariate  chart  be  sharpened? 

Third,  we  believe  that  other  outputs  of  military  combat  simulations  are  better 
modeled  as  inverse  gaussian  variates,  especially  the  time  to  complete  a  mission,  the 
time  to  reach  a  decision,  and  time  to  move  a  certain  distance.  We  wish  to  explore 
this  further,  with  an  eye  to  developing  regression  models  to  explain  these  responses. 
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We  also  believe  that  modeling  these  measures  of  effectiveness  as  inverse  gaussian 
random  variates  allows  rational  principles  of  design  of  experiments  to  be  applied  to 
the  conduct  of  these  simulations,  notably  power  and  sample  size  calculations. 

Fourth,  can  we  elicit  and  apply  informative  prior  distributions  to  these  military 
models?  How  does  that  effect  our  power  and  sample  size? 

Fifth,  can  we  develop  bivariate  CUSUM  schemes  with  diagnostic  tools  similar 
to  the  bivariate  Shewhart  schemes? 

Sixth,  can  we  find  or  develop  efficient  numerical  integration  techniques  for 
higher  dimensional  integrals  involving  indicator  functions? 

Last,  what  other  skewed  process  could  be  well  modeled  by  the  inverse  gaussian 
distribution,  and  benefit  from  the  application  of  these  tools  to  their  control? 

These  questions  will  keep  us  occupied  for  the  near  future. 


Appendix  A 

Statistical  Computing 


This  appendix  discusses  the  various  pieces  of  code  used  to  support  the  work  done  in 
this  dissertation.  We  include  it  for  two  reasons.  First,  it  allows  someone  to  request 
code  to  check  the  results  obtained  in  this  dissertation.  Second,  it  allows  someone  who 
desires  to  apply  the  results  of  this  thesis  to  save  the  labor  of  devising  and  writing 
functionally  equivalent  algorithms. 

Because  of  the  preference  at  the  University  for  XLISP-STAT  [Tierney,  1990], 
much  of  the  code  is  written  in  that  language.  However,  to  take  advantage  of  some 
earlier  results,  other  routines  are  written  in  FORTRAN  using  the  IMSL  routines 
[IMSL,  1989].  We  understand  that  the  computer  scientist  might  be  offended  by  such 
an  eclectic  approach. 

This  appendix  is  divided  into  functional  pieces,  as  indicated  by  the  headings. 


A.l  Generating  IG  variates 

This  Xlisp-Stat  function  is  based  upon  the  algorithm  recommended  by  Chhikara  and 
Folks[1989],  which  in  turn  was  based  on  work  by  Michael  et  al.  [1976]. 

;;;  generate  IG’s  ;;;  uses  algorithm  in  Chhikar a/Folks 
(defun  inv-gauss-rand  (n  mu  lambda)  (let*  ((y  (chisq-rand  n  1))  (xl 
(*  (/  mu  lambda  2)  (+  (*  2  lambda)  (*  mu  y)  (-  ((+  (*  4  lambda  mu  y)  (* 
mu  mu  y  y))  *5)))))  (u  (uniform-rand  n))) 

(dotimes  (i  n)  (setf  (select  xl  i)  (if  (<  (select  u  i)  (/  mu  (+  mu 
(select  xl  i))))  (select  xl  i)  (/  (*  mu  mu)  (select  xl  i)))))  xl  ))  This 
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and  a  number  of  other  Xlisp-Stat  function  are  available  from  the  author  upon  request. 

A.2  CUSUM  ARL  FORTRAN  routines  with  ex¬ 
planation 

We  have  four  classes  of  FORTRAN  routines,  available  from  the  author.  The  first 
generates  tables  of  ARL  vs.  h  for  either  the  optimal  or  the  naive  value  of  k.  We 
have  these  routines  for  upward  and  downward  shifts  of  p  and  A.  There  are  8,  named 
ARLl.f-  ARLS.f. 

The  second  class  of  programs  finds  the  value  of  h  to  obtain  a  given  one-sided 
ARL  for  either  the  optimal  or  naive  value  of  k.  As  before,  we  have  these  routines  for 
upward  and  downward  shifts  of  p  and  A.  There  are  8,  labeled  findl.f-  find8.f 

The  third  class  of  routines  finds  the  cutoff  values  for  HPD  regions,  using 
numerical  integration  routines.  There  is  one  routine  in  this  class  —  findbik.f,  which 
finds  the  value  of  k  such  that  P{f  {x,y)  >  k)  =  p. 

The  fourth  class  of  routines  computes  the  predictive  p  value  for  a  series  of 
observations,  using  the  predictive  distribution.  This  can  be  used  to  run  a  Shewhart 
or  CUSUM  chart  of  p- values.  The  routine  labeled  predl.f  computes  the  p- value  for 
the  next  observation.  The  routine  predm.f  computes  the  p-value  for  the  next  m 
observations. 

A.2.1  CUSARL 

Many  of  these  FORTRAN  routines  are  based  on  a  sub-program  called  CUSARL.f, 
by  Douglas  Hawkins.  That  routine  evaluates  the  average  run  length  for  a  one-sided 
CUSUM  chart  for  an  arbitrary  data  distribution  of  one  parameter.  We  have  made  one 
slight  modification,  allowing  for  a  two-parameter  distribution  to  be  evaluated  (such 
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as  the  /G(/x,  A). 

CUSARL  uses  a  Markov  chain  approximation  to  find  the  ARLs,  with  an  iter¬ 
ative  mesh  technique  which  uses  Richardson  extrapolation  to  incorporate  the  infor¬ 
mation  from  previous  iterations  into  the  current  estimate.  This  algorithm  is  accurate 
and  computationally  efficient. 

We  do  not  reproduce  the  CUSARL  code  here;  it  is  contained  in  the  paper  by 
Hawkins  [1992c]  and  is  also  available  by  ftp  from  the  StatLib  archives  at  Carnegie 
Mellon  University. 

A.2.2  Finding  ARL  for  shifts  in  ii  and  A:  ARLl  -  8.f 

The  routines  ARLl.f  — ARL4.f  find  tables  of  values  for  a  shift  in  the  mean  of  the 
IG{p,  A)  distribution.  The  user  enters  the  in-control  distribution,  the  out-of-control 
timing  value  for  the  mean,  and  a  range  of  h  values.  The  programs  compute  the  in¬ 
control  and  out-of-control  values  of  the  ARL  for  both  conventional  and  fast  initial 
response  schemes,  and  write  them  to  a  file  for  importation  into  a  graphics  program. 

These  routines  allow  the  user  to  see  the  trade-offs  in  using  each  of  the  schemes, 
and  how  the  ARLs  in-control  tend  to  grow  non-linearly  with  increasing  h,  while  the 
out-of-control  ARLs  tend  to  grow  linearly. 

Each  of  these  routines  is  for  a  point  alternative,  or  a  one-sided  CUSUM  scheme. 
ARLs  for  the  two-sided  schemes  must  be  found  by  combining  the  ARLs  as  described 
inEquationl.il. 

ARLl.f  finds  the  ARL  for  an  upward  mean  shift  using  the  optimal  scheme. 
ARL2.f  finds  the  ARL  for  an  upward  mean  shift  using  the  arithmetic  mean.  ARLS.f 
find  the  ARL  for  a  downward  mean  shift  using  the  optimal  scheme.  ARL4.  f  finds 
the  ARL  for  a  downward  mean  shift  using  the  arithmetic  mean. 

ARL5.f  through  ARLS.f  are  the  same  as  ARL1.4  through  ARL4.f,  except  they 


APPENDIX  A.  STATISTICAL  COMPUTING 


155 


are  for  shifts  in  A. 

These  routines  are  available  from  the  author  upon  request. 

A.2.3  Finding  decision  limits  for  a  CUSUM  scheme. 

The  optimal  scheme  in  this  thesis  tells  us  what  k,  the  reference  value,  should  be.  We 
still  have  to  determine  h,  the  reference  value,  to  meet  a  desired  ARL  for  the  given  k. 

We  used  a  modified  Newton-Raphson  method  with  a  difference  quotient  in¬ 
stead  of  the  exact  derivative  to  solve  for  h.  We  used  the  CUSARL  code  to  give  us  the 
ARL  of  an  arbitrary  h  as  f{h)  =  ARLh,  and  then  we  found  the  root  of  the  equation 
f{h)  —  ARL  =  0  for  our  desired  ARL. 

These  programs  are  named  FINDHl.f  through  FINDHS.f,  and  correspond  to 
the  cases  for  the  ARL  code. 

These,  too,  are  available  from  the  author. 


A. 3  Variance  reduction  techniques  for  IG  ARLs 

with  code 

We  needed  a  check  on  our  programming  for  the  FORTRAN  routines.  We  developed 
simulation  routines  to  find  ARLs  for  our  schemes.  We  used  a  clever  variance  reduction 
scheme  developed  by  Jun  and  Choi  [1991],  and  applied  it  to  the  inverse  gaussian 
distribution.  The  technique  uses  total  hazard  as  a  control  variate.  The  routine  is 
available  from  the  author  as  IG-ARL.LSP. 


A.4  Data 


The  data  files  used  in  this  thesis  are  available  from  the  author. 
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